diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index a8835cdde3..768312c202 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -34,6 +34,7 @@ repos: rev: v3.11.2 hooks: - id: markdown-link-check + args: ["-c", "markdown_link_config.json"] - repo: https://github.com/rstcheck/rstcheck rev: v6.2.0 diff --git a/markdown_link_config.json b/markdown_link_config.json new file mode 100644 index 0000000000..f50d947b85 --- /dev/null +++ b/markdown_link_config.json @@ -0,0 +1,7 @@ +{ + "ignorePatterns": [ + { + "pattern": "^/source/server-discovery-and-monitoring/server-discovery-and-monitoring.rst#" + } + ] +} \ No newline at end of file diff --git a/scripts/migrate_to_md.py b/scripts/migrate_to_md.py index aa6550efb8..6ab903767a 100644 --- a/scripts/migrate_to_md.py +++ b/scripts/migrate_to_md.py @@ -92,36 +92,47 @@ # (https://github.com/mongodb/specifications/blob/master/source/...) # and rewrite them to use appropriate md links. # If the link is malformed we ignore and print an error. -pattern = re.compile(f'(<.*{path.parent.name}/{path.name}[>#])') +rel_pattern = re.compile(f'(\.\.\S*/{path.name})') +md_pattern = re.compile(f'(\(http\S*/{path.name})') +rst_pattern = re.compile(f'(`_) +<../connection-monitoring-and-pooling/connection-monitoring-and-pooling.md>`_) support an option for limiting the connection pool size: ``maxPoolSize``. Drivers need to check out a connection before serializing the command. If the diff --git a/source/client-side-encryption/tests/README.rst b/source/client-side-encryption/tests/README.rst index 09880e9458..1f78cf1c4e 100644 --- a/source/client-side-encryption/tests/README.rst +++ b/source/client-side-encryption/tests/README.rst @@ -1133,7 +1133,7 @@ Repeat the steps from the "Via bypassAutoEncryption" test, replacing "bypassAuto 9. Deadlock Tests ~~~~~~~~~~~~~~~~~ -.. _Connection Monitoring and Pooling: /source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst +.. 
_Connection Monitoring and Pooling: ../../connection-monitoring-and-pooling/connection-monitoring-and-pooling.md The following tests only apply to drivers that have implemented a connection pool (see the `Connection Monitoring and Pooling`_ specification). diff --git a/source/command-logging-and-monitoring/command-logging-and-monitoring.rst b/source/command-logging-and-monitoring/command-logging-and-monitoring.rst index 06ddcc7079..7baf69358e 100644 --- a/source/command-logging-and-monitoring/command-logging-and-monitoring.rst +++ b/source/command-logging-and-monitoring/command-logging-and-monitoring.rst @@ -408,7 +408,7 @@ The following key-value pairs MUST be included in all command messages: * - driverConnectionId - Int64 - The driver's ID for the connection used for the command. Note this is NOT the same as ``CommandStartedEvent.connectionId`` defined above, - but refers to the `connectionId` defined in the `connection monitoring and pooling specification <../connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst>`_. + but refers to the `connectionId` defined in the `connection monitoring and pooling specification <../connection-monitoring-and-pooling/connection-monitoring-and-pooling.md>`_. Unlike ``CommandStartedEvent.connectionId`` this field MUST NOT contain the host/port; that information MUST be in the following fields, ``serverHost`` and ``serverPort``. This field is optional for drivers that do not implement CMAP if they do have an equivalent concept of a connection ID. 
diff --git a/source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.md b/source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.md new file mode 100644 index 0000000000..f58fd97cef --- /dev/null +++ b/source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.md @@ -0,0 +1,1412 @@ +# Connection Monitoring and Pooling + +- Status: Accepted +- Minimum Server Version: N/A + +## Abstract + +Drivers currently support a variety of options that allow users to configure connection pooling behavior. Users are +confused by drivers supporting different subsets of these options. Additionally, drivers implement their connection +pools differently, making it difficult to design cross-driver pool functionality. By unifying and codifying pooling +options and behavior across all drivers, we will increase user comprehension and code base maintainability. + +This specification does not apply to drivers that do not support multitasking. + +## META + +The keywords “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and +“OPTIONAL” in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt). + +## Definitions + +### Connection + +A Connection (when linked) refers to the `Connection` type defined in the +[Connection Pool Members](#connection-pool-members) section of this specification. It does not refer to an actual TCP +connection to an Endpoint. A `Connection` will attempt to create and wrap such a TCP connection over the course of its +existence, but it is not equivalent to one nor does it wrap an active one at all times. + +For the purposes of testing, a mocked `Connection` type could be used with the pool that never actually creates a TCP +connection or performs any I/O. + +### Endpoint + +For convenience, an Endpoint refers to either a **mongod** or **mongos** instance. 
+ +### Thread + +For convenience, a Thread refers to: + +- A shared-address-space process (a.k.a. a thread) in multi-threaded drivers +- An Execution Frame / Continuation in asynchronous drivers +- A goroutine in Go + +## Behavioral Description + +### Which Drivers this applies to + +This specification is solely concerned with drivers that implement a connection pool. A driver SHOULD implement a +connection pool, but is not required to. + +### Connection Pool Options + +All drivers that implement a connection pool MUST implement and conform to the same MongoClient options. There can be +slight deviation in naming to make the options idiomatic to the driver language. + +### Connection Pool Behaviors + +All driver connection pools MUST provide an API that allows the driver to check out a connection, check in a connection +back to the pool, and clear all connections in the pool. This API is for internal use only, and SHOULD NOT be documented +as a public API. + +### Connection Pool Monitoring + +All drivers that implement a connection pool MUST provide an API that allows users to subscribe to events emitted from +the pool. Conceptually, event emission is instantaneous, i.e., one may talk about the instant an event is emitted, and +represents the start of an activity of delivering the event to a subscribed user. + +## Detailed Design + +### Connection Pool Options + +Drivers that implement a Connection Pool MUST support the following ConnectionPoolOptions: + +```typescript +interface ConnectionPoolOptions { + /** + * The maximum number of Connections that may be associated + * with a pool at a given time. This includes in use and + * available connections. + * If specified, MUST be an integer >= 0. + * A value of 0 means there is no limit. + * Defaults to 100. + */ + maxPoolSize?: number; + + /** + * The minimum number of Connections that MUST exist at any moment + * in a single connection pool. + * If specified, MUST be an integer >= 0. 
If maxPoolSize is > 0 + * then minPoolSize must be <= maxPoolSize + * Defaults to 0. + */ + minPoolSize?: number; + + /** + * The maximum amount of time a Connection should remain idle + * in the connection pool before being marked idle. + * If specified, MUST be a number >= 0. + * A value of 0 means there is no limit. + * Defaults to 0. + */ + maxIdleTimeMS?: number; + + /** + * The maximum number of Connections a Pool may be establishing concurrently. + * Establishment of a Connection is a part of its life cycle + * starting after a ConnectionCreatedEvent and ending before a ConnectionReadyEvent. + * If specified, MUST be a number > 0. + * Defaults to 2. + */ + maxConnecting?: number; +} +``` + +Additionally, Drivers that implement a Connection Pool MUST support the following ConnectionPoolOptions UNLESS that +driver meets ALL of the following conditions: + +- The driver/language currently has an idiomatic timeout mechanism implemented +- The timeout mechanism conforms to [the aggressive requirement of timing out a thread in the WaitQueue](#waitqueue) + +```typescript +interface ConnectionPoolOptions { + /** + * NOTE: This option has been deprecated in favor of timeoutMS. + * + * The maximum amount of time a thread can wait for + * either an available non-perished connection (limited by `maxPoolSize`), + * or a pending connection (limited by `maxConnecting`). + * If specified, MUST be a number >= 0. + * A value of 0 means there is no limit. + * Defaults to 0. + */ + waitQueueTimeoutMS?: number; +} +``` + +These options MUST be specified at the MongoClient level, and SHOULD be named in a manner idiomatic to the driver's +language. All connection pools created by a MongoClient MUST use the same ConnectionPoolOptions. + +When parsing a mongodb connection string, a user MUST be able to specify these options using the default names specified +above. + +#### Deprecated Options + +The following ConnectionPoolOptions are considered deprecated. 
They MUST NOT be implemented if they do not already exist +in a driver, and they SHOULD be deprecated and removed from drivers that implement them as early as possible: + +```typescript +interface ConnectionPoolOptions { + /** + * The maximum number of threads that can simultaneously wait + * for a Connection to become available. + */ + waitQueueSize?: number; + + /** + * An alternative way of setting waitQueueSize, it specifies + * the maximum number of threads that can wait per connection. + * waitQueueSize === waitQueueMultiple \* maxPoolSize + */ + waitQueueMultiple?: number +} +``` + +### Connection Pool Members + +#### Connection + +A driver-defined wrapper around a single TCP connection to an Endpoint. A [Connection](#connection-1) has the following +properties: + +- **Single Endpoint:** A [Connection](#connection-1) MUST be associated with a single Endpoint. A + [Connection](#connection-1) MUST NOT be associated with multiple Endpoints. +- **Single Lifetime:** A [Connection](#connection-1) MUST NOT be used after it is closed. +- **Single Owner:** A [Connection](#connection-1) MUST belong to exactly one Pool, and MUST NOT be shared across + multiple pools +- **Single Track:** A [Connection](#connection-1) MUST limit itself to one request / response at a time. A + [Connection](#connection-1) MUST NOT multiplex/pipeline requests to an Endpoint. +- **Monotonically Increasing ID:** A [Connection](#connection-1) MUST have an ID number associated with it. + [Connection](#connection-1) IDs within a Pool MUST be assigned in order of creation, starting at 1 and increasing by 1 + for each new Connection. 
+- **Valid Connection:** A connection MUST NOT be checked out of the pool until it has successfully and fully completed a + MongoDB Handshake and Authentication as specified in the + [Handshake](https://github.com/mongodb/specifications/blob/master/source/mongodb-handshake/handshake.rst), + [OP_COMPRESSED](https://github.com/mongodb/specifications/blob/master/source/compression/OP_COMPRESSED.rst), and + [Authentication](https://github.com/mongodb/specifications/blob/master/source/auth/auth.rst) specifications. +- **Perishable**: it is possible for a [Connection](#connection-1) to become **Perished**. A [Connection](#connection-1) + is considered perished if any of the following are true: + - **Stale:** The [Connection](#connection-1) 's generation does not match the generation of the parent pool + - **Idle:** The [Connection](#connection-1) is currently "available" (as defined below) and has been for longer than + **maxIdleTimeMS**. + - **Errored:** The [Connection](#connection-1) has experienced an error that indicates it is no longer recommended for + use. Examples include, but are not limited to: + - Network Error + - Network Timeout + - Endpoint closing the connection + - Driver-Side Timeout + - Wire-Protocol Error + +```typescript +interface Connection { + /** + * An id number associated with the Connection + */ + id: number; + + /** + * The address of the pool that owns this Connection + */ + address: string; + + /** + * An integer representing the “generation” of the pool + * when this Connection was created. + */ + generation: number; + + /** + * The current state of the Connection. + * + * Possible values are the following: + * - "pending": The Connection has been created but has not yet been established. Contributes to + * totalConnectionCount and pendingConnectionCount. + * + * - "available": The Connection has been established and is waiting in the pool to be checked + * out. Contributes to both totalConnectionCount and availableConnectionCount. 
+ * + * - "in use": The Connection has been established, checked out from the pool, and has yet + * to be checked back in. Contributes to totalConnectionCount. + * + * - "closed": The Connection has had its socket closed and cannot be used for any future + * operations. Does not contribute to any connection counts. + * + * Note: this field is mainly used for the purposes of describing state + * in this specification. It is not required that drivers + * actually include this field in their implementations of Connection. + */ + state: "pending" | "available" | "in use" | "closed"; +} +``` + +#### WaitQueue + +A concept that represents pending requests for [Connections](#connection). When a thread requests a +[Connection](#connection) from a Pool, the thread enters the Pool's WaitQueue. A thread stays in the WaitQueue until it +either receives a [Connection](#connection) or times out. A WaitQueue has the following traits: + +- **Thread-Safe**: When multiple threads attempt to enter or exit a WaitQueue, they do so in a thread-safe manner. +- **Ordered/Fair**: When [Connections](#connection) are made available, they are issued out to threads in the order that + the threads entered the WaitQueue. +- **Timeout aggressively:** Members of a WaitQueue MUST timeout if they are enqueued for longer than the computed + timeout and MUST leave the WaitQueue immediately in this case. + +The implementation details of a WaitQueue are left to the driver. Example implementations include: + +- A fair Semaphore +- A Queue of callbacks + +#### Connection Pool + +A driver-defined entity that encapsulates all non-monitoring [Connections](#connection) associated with a single +Endpoint. The pool has the following properties: + +- **Thread Safe:** All Pool behaviors MUST be thread safe. +- **Not Fork-Safe:** A Pool is explicitly not fork-safe. 
 If a Pool detects that it is being used by a forked process, it + MUST immediately clear itself and update its pid +- **Single Owner:** A Pool MUST be associated with exactly one Endpoint, and MUST NOT be shared between Endpoints. +- **Emit Events and Log Messages:** A Pool MUST emit pool events and log messages when dictated by this spec (see + [Connection Pool Monitoring](#connection-pool-monitoring)). Users MUST be able to subscribe to emitted events and log + messages in a manner idiomatic to their language and driver. +- **Closeable:** A Pool MUST be able to be manually closed. When a Pool is closed, the following behaviors change: + - Checking in a [Connection](#connection) to the Pool automatically closes the [Connection](#connection) + - Attempting to check out a [Connection](#connection) from the Pool results in an Error +- **Clearable:** A Pool MUST be able to be cleared. Clearing the pool marks all pooled and checked out + [Connections](#connection) as stale and lazily closes them as they are checkedIn or encountered in checkOut. + Additionally, all requests are evicted from the WaitQueue and return errors that are considered non-timeout network + errors. +- **Pausable:** A Pool MUST be able to be paused and resumed. A Pool is paused automatically when it is cleared, and it + can be resumed by being marked as "ready". While the Pool is paused, it exhibits the following behaviors: + - Attempting to check out a [Connection](#connection) from the Pool results in a non-timeout network error + - Connections are not created in the background to satisfy minPoolSize +- **Capped:** a pool is capped if **maxPoolSize** is set to a non-zero value.
If a pool is capped, then its total number + of [Connections](#connection) (including available and in use) MUST NOT exceed **maxPoolSize** +- **Rate-limited:** A Pool MUST limit the number of [Connections](#connection) being + [established](#establishing-a-connection-internal-implementation) concurrently via the **maxConnecting** + [pool option](#connection-pool-options-1). + +```typescript +interface ConnectionPool { + /** + * The Queue of threads waiting for a Connection to be available + */ + waitQueue: WaitQueue; + + /** + * A generation number representing the SDAM generation of the pool. + */ + generation: number; + + /** + * A map representing the various generation numbers for various services + * when in load balancer mode. + */ + serviceGenerations: Map; + + /** + * The state of the pool. + * + * Possible values are the following: + * - "paused": The initial state of the pool. Connections may not be checked out nor can they + * be established in the background to satisfy minPoolSize. Clearing a pool + * transitions it to this state. + * + * - "ready": The healthy state of the pool. It can service checkOut requests and create + * connections in the background. The pool can be set to this state via the + * ready() method. + * + * - "closed": The pool is destroyed. No more Connections may ever be checked out nor any + * created in the background. The pool can be set to this state via the close() + * method. The pool cannot transition to any other state after being closed. + */ + state: "paused" | "ready" | "closed"; + + // Any of the following connection counts may be computed rather than + // actually stored on the pool. + + /** + * An integer expressing how many total Connections + * ("pending" + "available" + "in use") the pool currently has + */ + totalConnectionCount: number; + + /** + * An integer expressing how many Connections are currently + * available in the pool. 
+ */ + availableConnectionCount: number; + + /** + * An integer expressing how many Connections are currently + * being established. + */ + pendingConnectionCount: number; + + /** + * Returns a Connection for use + */ + checkOut(): Connection; + + /** + * Check in a Connection back to the Connection pool + */ + checkIn(connection: Connection): void; + + /** + * Mark all current Connections as stale, clear the WaitQueue, and mark the pool as "paused". + * No connections may be checked out or created in this pool until ready() is called again. + * interruptInUseConnections specifies whether the pool will force interrupt "in use" connections as part of the clear. + * Default false. + */ + clear(interruptInUseConnections: Optional): void; + + /** + * Mark the pool as "ready", allowing checkOuts to resume and connections to be created in the background. + * A pool can only transition from "paused" to "ready". A "closed" pool + * cannot be marked as "ready" via this method. + */ + ready(): void; + + /** + * Marks the pool as "closed", preventing the pool from creating and returning new Connections + */ + close(): void; +} +``` + +### Connection Pool Behaviors + +#### Creating a Connection Pool + +This specification does not define how a pool is to be created, leaving it up to the driver. Creation of a connection +pool is generally an implementation detail of the driver, i.e., is not a part of the public API of the driver. The SDAM +specification defines +[when](https://github.com/mongodb/specifications/blob/master/source/server-discovery-and-monitoring/server-discovery-and-monitoring.rst#connection-pool-creation) +the driver should create connection pools. + +When a pool is created, its state MUST initially be set to "paused". Even if minPoolSize is set, the pool MUST NOT begin +being [populated](#populating-the-pool-with-a-connection-internal-implementation) with [Connections](#connection) until +it has been marked as "ready". 
SDAM will mark the pool as "ready" on each successful check. See +[Connection Pool Management](/source/server-discovery-and-monitoring/server-discovery-and-monitoring.rst#connection-pool-management) +section in the SDAM specification for more information. + +``` +set generation to 0 +set state to "paused" +emit PoolCreatedEvent and equivalent log message +``` + +#### Closing a Connection Pool + +When a pool is closed, it MUST first close all available [Connections](#connection) in that pool. This results in the +following behavior changes: + +- In use [Connections](#connection) MUST be closed when they are checked in to the closed pool. +- Attempting to check out a [Connection](#connection) MUST result in an error. + +``` +mark pool as "closed" +for connection in availableConnections: + close connection +emit PoolClosedEvent and equivalent log message +``` + +#### Marking a Connection Pool as Ready + +Connection Pools start off as "paused", and they are marked as "ready" by monitors after they perform successful server +checks. Once a pool is "ready", it can start checking out [Connections](#connection) and populating them in the +background. + +If the pool is already "ready" when this method is invoked, then this method MUST immediately return and MUST NOT emit a +PoolReadyEvent. + +``` +mark pool as "ready" +emit PoolReadyEvent and equivalent log message +allow background thread to create connections +``` + +Note that the PoolReadyEvent MUST be emitted before the background thread is allowed to resume creating new connections, +and it must be the case that no observer is able to observe actions of the background thread related to creating new +connections before observing the PoolReadyEvent event. + +#### Creating a Connection (Internal Implementation) + +When creating a [Connection](#connection), the initial [Connection](#connection) is in a “pending” state. This only +creates a “virtual” [Connection](#connection), and performs no I/O. 
+ +``` +connection = new Connection() +increment totalConnectionCount +increment pendingConnectionCount +set connection state to "pending" +tConnectionCreated = current instant (use a monotonic clock if possible) +emit ConnectionCreatedEvent and equivalent log message +return connection +``` + +#### Establishing a Connection (Internal Implementation) + +Before a [Connection](#connection) can be marked as either "available" or "in use", it must be established. This process +involves performing the initial handshake, handling OP_COMPRESSED, and performing authentication. + +``` +try: + connect connection via TCP / TLS + perform connection handshake + handle OP_COMPRESSED + perform connection authentication + tConnectionReady = current instant (use a monotonic clock if possible) + emit ConnectionReadyEvent(duration = tConnectionReady - tConnectionCreated) and equivalent log message + return connection +except error: + close connection + throw error # Propagate error in manner idiomatic to language. +``` + +#### Closing a Connection (Internal Implementation) + +When a [Connection](#connection) is closed, it MUST first be marked as "closed", removing it from being counted as +"available" or "in use". Once that is complete, the [Connection](#connection) can perform whatever teardown is necessary +to close its underlying socket. The Driver SHOULD perform this teardown in a non-blocking manner, such as via the use of +a background thread or async I/O. + +``` +original state = connection state +set connection state to "closed" + +if original state is "available": + decrement availableConnectionCount +else if original state is "pending": + decrement pendingConnectionCount + +decrement totalConnectionCount +emit ConnectionClosedEvent and equivalent log message + +# The following can happen at a later time (i.e. in background +# thread) or via non-blocking I/O. 
+connection.socket.close() +``` + +#### Marking a Connection as Available (Internal Implementation) + +A [Connection](#connection) is "available" if it is able to be checked out. A [Connection](#connection) MUST NOT be +marked as "available" until it has been established. The pool MUST keep track of the number of currently available +[Connections](#connection). + +``` +increment availableConnectionCount +set connection state to "available" +add connection to availableConnections +``` + +#### Populating the Pool with a Connection (Internal Implementation) + +"Populating" the pool involves preemptively creating and establishing a [Connection](#connection) which is marked as +"available" for use in future operations. + +Populating the pool MUST NOT block any application threads. For example, it could be performed on a background thread or +via the use of non-blocking/async I/O. Populating the pool MUST NOT be performed unless the pool is "ready". + +If an error is encountered while populating a connection, it MUST be handled via the SDAM machinery according to the +[Application Errors](/source/server-discovery-and-monitoring/server-discovery-and-monitoring.rst#application-errors) +section in the SDAM specification. + +If minPoolSize is set, the [Connection](#connection) Pool MUST be populated until it has at least minPoolSize total +[Connections](#connection). This MUST occur only while the pool is "ready". If the pool implements a background thread, +it can be used for this. If the pool does not implement a background thread, the checkOut method is responsible for +ensuring this requirement is met. + +When populating the Pool, pendingConnectionCount has to be decremented after establishing a [Connection](#connection-1) +similarly to how it is done in [Checking Out a Connection](#checking-out-a-connection) to signal that another +[Connection](#connection-1) is allowed to be established. 
Such a signal MUST become observable to any [Thread](#thread) +after the action that +[marks the established Connection as "available"](#marking-a-connection-as-available-internal-implementation) becomes +observable to the [Thread](#thread). Informally, this order guarantees that no [Thread](#thread) tries to start +establishing a [Connection](#connection-1) when there is an "available" [Connection](#connection-1) established as a +result of populating the Pool. + +``` +wait until pendingConnectionCount < maxConnecting and pool is "ready" +create connection +try: + establish connection + mark connection as available +except error: + # Defer error handling to SDAM. + topology.handle_pre_handshake_error(error) +``` + +#### Checking Out a Connection + +A Pool MUST have a method that allows the driver to check out a [Connection](#connection-1). Checking out a +[Connection](#connection-1) involves submitting a request to the WaitQueue and, once that request reaches the front of +the queue, having the Pool find or create a [Connection](#connection-1) to fulfill that request. Requests MUST be +subject to a timeout which is computed per the rules in +[Client Side Operations Timeout: Server Selection](../client-side-operations-timeout/client-side-operations-timeout.md#server-selection). + +To service a request for a [Connection](#connection-1), the Pool MUST first iterate over the list of available +[Connections](#connection), searching for a non-perished one to be returned. If a perished [Connection](#connection-1) +is encountered, such a [Connection](#connection-1) MUST be closed (as described in +[Closing a Connection](#closing-a-connection-internal-implementation)) and the iteration of available +[Connections](#connection) MUST continue until either a non-perished available [Connection](#connection-1) is found or +the list of available [Connections](#connection) is exhausted. 
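The scan over available connections described above can be sketched in Python. This is a minimal illustration, not the specification's reference algorithm: `Conn`, `is_perished`, `find_usable`, and the `close` callback are hypothetical names. Taking connections from the end of the list is an arbitrary choice made here, and `maxIdleTimeMS == 0` is treated as "no idle limit", as the option's description states.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Conn:
    """Hypothetical stand-in for the Connection type in this document."""
    id: int
    generation: int
    idle_since: float      # monotonic timestamp (seconds) when it became available
    errored: bool = False
    state: str = "available"


def is_perished(conn: Conn, pool_generation: int,
                max_idle_time_ms: int, now: float) -> bool:
    """A Connection is perished if it is stale, idle for too long, or errored."""
    stale = conn.generation != pool_generation
    idle = max_idle_time_ms > 0 and (now - conn.idle_since) * 1000 > max_idle_time_ms
    return stale or idle or conn.errored


def find_usable(available: List[Conn], pool_generation: int,
                max_idle_time_ms: int, now: float,
                close: Callable[[Conn], None]) -> Optional[Conn]:
    """Take available connections until a non-perished one is found.

    Perished connections are closed along the way. Returns None once the
    list is exhausted, in which case the caller decides whether to create
    a new connection or wait.
    """
    while available:
        conn = available.pop()
        if is_perished(conn, pool_generation, max_idle_time_ms, now):
            close(conn)
            continue
        return conn
    return None
```

Passing `now` and `close` as parameters keeps the sketch free of I/O and clock access, so it can be exercised with a mocked connection type as the Definitions section suggests.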
+ +If the list is exhausted, the total number of [Connections](#connection) is less than maxPoolSize, and +pendingConnectionCount \< maxConnecting, the pool MUST create a [Connection](#connection-1), establish it, mark it as +"in use" and return it. If totalConnectionCount == maxPoolSize or pendingConnectionCount == maxConnecting, then the pool +MUST wait to service the request until neither of those two conditions are met or until a [Connection](#connection-1) +becomes available, re-entering the checkOut loop in either case. This waiting MUST NOT prevent +[Connections](#connection) from being checked into the pool. Additionally, the Pool MUST NOT service any newer checkOut +requests before fulfilling the original one which could not be fulfilled. For drivers that implement the WaitQueue via a +fair semaphore, a condition variable may also be needed to meet this requirement. Waiting on the condition variable +SHOULD also be limited by the WaitQueueTimeout, if the driver supports one and it was specified by the user. + +If the pool is "closed" or "paused", any attempt to check out a [Connection](#connection) MUST throw an Error. The error +thrown as a result of the pool being "paused" MUST be considered a retryable error and MUST NOT be an error that marks +the SDAM state unknown. + +If the pool does not implement a background thread, the checkOut method is responsible for ensuring that the pool is +[populated](#populating-the-pool-with-a-connection-internal-implementation) with at least minPoolSize +[Connections](#connection). + +A [Connection](#connection) MUST NOT be checked out until it is established. In addition, the Pool MUST NOT prevent +other threads from checking out [Connections](#connection) while establishing a [Connection](#connection). + +Before a given [Connection](#connection) is returned from checkOut, it must be marked as "in use", and the pool's +availableConnectionCount MUST be decremented.
+ +```python +connection = Null +tConnectionCheckOutStarted = current instant (use a monotonic clock if possible) +emit ConnectionCheckOutStartedEvent and equivalent log message +try: + enter WaitQueue + wait until at top of wait queue + # Note that in a lock-based implementation of the wait queue would + # only allow one thread in the following block at a time + while connection is Null: + if a connection is available: + while connection is Null and a connection is available: + connection = next available connection + if connection is perished: + close connection + connection = Null + else if totalConnectionCount < maxPoolSize: + if pendingConnectionCount < maxConnecting: + connection = create connection + else: + # this waiting MUST NOT prevent other threads from checking Connections + # back in to the pool. + wait until pendingConnectionCount < maxConnecting or a connection is available + continue + +except pool is "closed": + tConnectionCheckOutFailed = current instant (use a monotonic clock if possible) + emit ConnectionCheckOutFailedEvent(reason="poolClosed", duration = tConnectionCheckOutFailed - tConnectionCheckOutStarted) and equivalent log message + throw PoolClosedError +except pool is "paused": + tConnectionCheckOutFailed = current instant (use a monotonic clock if possible) + emit ConnectionCheckOutFailedEvent(reason="connectionError", duration = tConnectionCheckOutFailed - tConnectionCheckOutStarted) and equivalent log message + throw PoolClearedError +except timeout: + tConnectionCheckOutFailed = current instant (use a monotonic clock if possible) + emit ConnectionCheckOutFailedEvent(reason="timeout", duration = tConnectionCheckOutFailed - tConnectionCheckOutStarted) and equivalent log message + throw WaitQueueTimeoutError +finally: + # This must be done in all drivers + leave wait queue + +# If the Connection has not been established yet (TCP, TLS, +# handshake, compression, and auth), it must be established +# before it is returned. 
+# This MUST NOT block other threads from acquiring connections. +if connection state is "pending": + try: + establish connection + except connection establishment error: + tConnectionCheckOutFailed = current instant (use a monotonic clock if possible) + emit ConnectionCheckOutFailedEvent(reason="connectionError", duration = tConnectionCheckOutFailed - tConnectionCheckOutStarted) and equivalent log message + decrement totalConnectionCount + throw + finally: + decrement pendingConnectionCount +else: + decrement availableConnectionCount +set connection state to "in use" + +# If there is no background thread, the pool MUST ensure that +# there are at least minPoolSize total connections. +do asynchronously: + while totalConnectionCount < minPoolSize: + populate the pool with a connection + +tConnectionCheckedOut = current instant (use a monotonic clock if possible) +emit ConnectionCheckedOutEvent(duration = tConnectionCheckedOut - tConnectionCheckOutStarted) and equivalent log message +return connection +``` + +#### Checking In a Connection + +A Pool MUST have a method of allowing the driver to check in a [Connection](#connection). The driver MUST NOT be allowed +to check in a [Connection](#connection) to a Pool that did not create that [Connection](#connection), and MUST throw an +Error if this is attempted. + +When the [Connection](#connection) is checked in, it MUST be [closed](#closing-a-connection-internal-implementation) if +any of the following are true: + +- The [Connection](#connection) is perished. +- The pool has been closed. + +Otherwise, the [Connection](#connection) is marked as available. + +``` +emit ConnectionCheckedInEvent and equivalent log message +if connection is perished OR pool is closed: + close connection +else: + mark connection as available +``` + +#### Clearing a Connection Pool + +Clearing the pool involves different steps depending on whether the pool is in load balanced mode or not. 
The +traditional / non-load balanced clearing behavior MUST NOT be used by pools in load balanced mode, and the load balanced +pool clearing behavior MUST NOT be used in non-load balanced pools. + +##### Clearing a non-load balanced pool + +A Pool MUST have a method of clearing all [Connections](#connection) when instructed. Rather than iterating through +every [Connection](#connection), this method should simply increment the generation of the Pool, implicitly marking all +current [Connections](#connection) as stale. It should also transition the pool's state to "paused" to halt the creation +of new connections until it is marked as "ready" again. The checkOut and checkIn algorithms will handle clearing out +stale [Connections](#connection). If a user is subscribed to Connection Monitoring events and/or connection log +messages, a PoolClearedEvent and log message MUST be emitted after incrementing the generation / marking the pool as +"paused". If the pool is already "paused" when it is cleared, then the pool MUST NOT emit a PoolCleared event or log +message. + +As part of clearing the pool, the WaitQueue MUST also be cleared, meaning all requests in the WaitQueue MUST fail with +errors indicating that the pool was cleared while the checkOut was being performed. The error returned as a result of +the pool being cleared MUST be considered a retryable error and MUST NOT be an error that marks the SDAM state unknown. +Clearing the WaitQueue MUST happen eagerly so that any operations waiting on [Connections](#connection) can retry as +soon as possible. The pool MUST NOT rely on WaitQueueTimeoutMS to clear requests from the WaitQueue. + +The clearing method MUST provide the option to interrupt any in-use connections as part of the clearing (henceforth +referred to as the interruptInUseConnections flag in this specification). 
"Interrupting a Connection" is defined as +canceling whatever task the Connection is currently performing and marking the Connection as perished (e.g. by closing +its underlying socket). The interrupting of these Connections MUST be performed as soon as possible but MUST NOT block +the pool or prevent it from processing further requests. If the pool has a background thread, and it is responsible for +interrupting in-use connections, its next run MUST be scheduled as soon as possible. + +The pool MUST only interrupt in-use Connections whose generation is less than or equal to the generation of the pool at +the moment of the clear (before the increment) that used the interruptInUseConnections flag. Any operations that have +their Connections interrupted in this way MUST fail with a retryable error. If possible, the error SHOULD be a +PoolClearedError with the following message: "Connection to \ interrupted due to server monitor timeout". + +##### Clearing a load balanced pool + +A Pool MUST also have a method of clearing all [Connections](#connection) for a specific `serviceId` for use when in +load balancer mode. This method increments the generation of the pool for that specific `serviceId` in the generation +map. A PoolClearedEvent and log message MUST be emitted after incrementing the generation. Note that this method MUST +NOT transition the pool to the "paused" state and MUST NOT clear the WaitQueue. + +#### Load Balancer Mode + +For load-balanced deployments, pools MUST maintain a map from `serviceId` to a tuple of (generation, connection count) +where the connection count refers to the total number of connections that exist for a specific `serviceId`. The pool +MUST remove the entry for a `serviceId` once the connection count reaches 0. Once the MongoDB handshake is done, the +connection MUST get the generation number that applies to its `serviceId` from the map and update the map to increment +the connection count for this `serviceId`. 
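The per-`serviceId` bookkeeping described above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: `ServiceGenerationMap`, `ServiceEntry`, and every method name are hypothetical, and the choice to treat a clear for an unknown `serviceId` as a no-op is an assumption rather than something this specification states.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ServiceEntry:
    """Per-serviceId tuple of (generation, connection count)."""
    generation: int = 0
    connection_count: int = 0


class ServiceGenerationMap:
    """Hypothetical helper; names are illustrative and not part of the spec."""

    def __init__(self) -> None:
        self._entries: Dict[str, ServiceEntry] = {}

    def on_handshake_complete(self, service_id: str) -> int:
        """Count the new connection and return the generation it should record."""
        entry = self._entries.setdefault(service_id, ServiceEntry())
        entry.connection_count += 1
        return entry.generation

    def on_connection_closed(self, service_id: str) -> None:
        """Remove the entry once the connection count for service_id reaches 0."""
        entry = self._entries.get(service_id)
        if entry is None:
            return
        entry.connection_count -= 1
        if entry.connection_count <= 0:
            del self._entries[service_id]

    def clear(self, service_id: str) -> None:
        """Increment the generation so existing connections become stale."""
        entry = self._entries.get(service_id)
        # Assumption: with no tracked connections there is nothing to mark stale.
        if entry is not None:
            entry.generation += 1

    def is_stale(self, service_id: str, connection_generation: int) -> bool:
        """A connection is stale if its generation no longer matches the map."""
        entry = self._entries.get(service_id)
        return entry is not None and connection_generation != entry.generation
```

In this sketch a connection stores the value returned by `on_handshake_complete` and later passes it to `is_stale`. Because an entry is only removed when no connections remain for that `serviceId`, restarting its generation at 0 cannot cause an older connection to be mistaken for a current one.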
+ +See the [Load Balancer Specification](../load-balancers/load-balancers.rst#connection-pooling) for details. + +#### Forking + +A [Connection](#connection) is explicitly not fork-safe. The proper behavior in the case of a fork is to ResetAfterFork +by: + +- clearing all Connection Pools in the child process +- closing all [Connections](#connection) in the child process. + +Drivers that support forking MUST document that [Connections](#connection) to an Endpoint are not fork-safe, and +document the proper way to ResetAfterFork in the driver. + +Drivers MAY aggressively ResetAfterFork if the driver detects it has been forked. + +#### Optional Behaviors + +The following features of a Connection Pool SHOULD be implemented if they make sense in the driver and driver's +language. + +##### Background Thread + +A Pool SHOULD have a background Thread that is responsible for monitoring the state of all available +[Connections](#connection). This background thread SHOULD: + +- Populate [Connections](#connection) to ensure that the pool always satisfies minPoolSize. +- Remove and close perished available [Connections](#connection), including "in use" connections if the + `interruptInUseConnections` option was set to true in the most recent pool clear. +- Apply timeouts to connection establishment per + [Client Side Operations Timeout: Background Connection Pooling](../client-side-operations-timeout/client-side-operations-timeout.md#background-connection-pooling). + +A pool SHOULD allow immediate scheduling of the next background thread iteration after a clear is performed. + +Conceptually, the aforementioned activities are organized into sequential Background Thread Runs. A Run MUST do as much +work as is readily available and then end instead of waiting for more work. 
For example, instead of waiting for +pendingConnectionCount to become less than maxConnecting when satisfying minPoolSize, a Run MUST either proceed with the +rest of its duties, e.g., closing available perished connections, or end. + +The duration of intervals between the end of one Run and the beginning of the next Run is not specified, but the +[Test Format and Runner Specification](https://github.com/mongodb/specifications/tree/master/source/connection-monitoring-and-pooling/tests) +may restrict this duration, or introduce other restrictions to facilitate testing. + +##### withConnection + +A Pool SHOULD implement a scoped resource management mechanism idiomatic to their language to prevent +[Connections](#connection) from not being checked in. Examples include +[Python's "with" statement](https://docs.python.org/3/whatsnew/2.6.html#pep-343-the-with-statement) and +[C#'s "using" statement](https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/using-statement). If +implemented, drivers SHOULD use this method as the default method of checking out and checking in +[Connections](#connection). + +### Connection Pool Monitoring + +All drivers that implement a connection pool MUST provide an API that allows users to subscribe to events emitted from +the pool. If a user subscribes to Connection Monitoring events, these events MUST be emitted when specified in +“Connection Pool Behaviors”. Events SHOULD be created and subscribed to in a manner idiomatic to their language and +driver. + +#### Events + +See the [Load Balancer Specification](../load-balancers/load-balancers.rst#events) for details on the `serviceId` field. + +```typescript +/** + * Emitted when a Connection Pool is created + */ +interface PoolCreatedEvent { + /** + * The ServerAddress of the Endpoint the pool is attempting to connect to. + */ + address: string; + + /** + * Any non-default pool options that were set on this Connection Pool. 
+ */ + options: {...} +} + +/** + * Emitted when a Connection Pool is marked as ready. + */ +interface PoolReadyEvent { + /** + * The ServerAddress of the Endpoint the pool is attempting to connect to. + */ + address: string; +} + +/** + * Emitted when a Connection Pool is cleared + */ +interface PoolClearedEvent { + /** + * The ServerAddress of the Endpoint the pool is attempting to connect to. + */ + address: string; + + /** + * The service id for which the pool was cleared for in load balancing mode. + * See load balancer specification for more information about this field. + */ + serviceId: Optional; + + /** + * A flag whether the pool forced interrupting "in use" connections as part of the clear. + */ + interruptInUseConnections: Optional; +} + +/** + * Emitted when a Connection Pool is closed + */ +interface PoolClosedEvent { + /** + * The ServerAddress of the Endpoint the pool is attempting to connect to. + */ + address: string; +} + +/** + * Emitted when a Connection Pool creates a Connection object. + * NOTE: This does not mean that the Connection is ready for use. + */ +interface ConnectionCreatedEvent { + /** + * The ServerAddress of the Endpoint the pool is attempting to connect to. + */ + address: string; + + /** + * The ID of the Connection + */ + connectionId: int64; +} + +/** + * Emitted when a Connection has finished its setup, and is now ready to use + */ +interface ConnectionReadyEvent { + /** + * The ServerAddress of the Endpoint the pool is attempting to connect to. + */ + address: string; + + /** + * The ID of the Connection + */ + connectionId: int64; + + /** + * The time it took to establish the connection. + * In accordance with the definition of establishment of a connection + * specified by `ConnectionPoolOptions.maxConnecting`, + * it is the time elapsed between emitting a `ConnectionCreatedEvent` + * and emitting this event as part of the same checking out. 
+ * + * Naturally, when establishing a connection is part of checking out, + * this duration is not greater than + * `ConnectionCheckedOutEvent`/`ConnectionCheckOutFailedEvent.duration`. + * + * A driver MAY choose the type idiomatic to the driver. + * If the type chosen does not convey units, e.g., `int64`, + * then the driver MAY include units in the name, e.g., `durationMS`. + */ + duration: Duration; +} + +/** + * Emitted when a Connection Pool closes a Connection + */ +interface ConnectionClosedEvent { + /** + * The ServerAddress of the Endpoint the pool is attempting to connect to. + */ + address: string; + + /** + * The ID of the Connection + */ + connectionId: int64; + + /** + * A reason explaining why this Connection was closed. + * Can be implemented as a string or enum. + * Current valid values are: + * - "stale": The pool was cleared, making the Connection no longer valid + * - "idle": The Connection became stale by being available for too long + * - "error": The Connection experienced an error, making it no longer valid + * - "poolClosed": The pool was closed, making the Connection no longer valid + */ + reason: string|Enum; +} + +/** + * Emitted when the driver starts attempting to check out a Connection + */ +interface ConnectionCheckOutStartedEvent { + /** + * The ServerAddress of the Endpoint the pool is attempting + * to connect to. + */ + address: string; +} + +/** + * Emitted when the driver's attempt to check out a Connection fails + */ +interface ConnectionCheckOutFailedEvent { + /** + * The ServerAddress of the Endpoint the pool is attempting to connect to. + */ + address: string; + + /** + * A reason explaining why Connection check out failed. + * Can be implemented as a string or enum. 
+ * Current valid values are: + * - "poolClosed": The pool was previously closed, and cannot provide new Connections + * - "timeout": The Connection check out attempt exceeded the specified timeout + * - "connectionError": The Connection check out attempt experienced an error while setting up a new Connection + */ + reason: string|Enum; + + /** + * See `ConnectionCheckedOutEvent.duration`. + */ + duration: Duration; +} + +/** + * Emitted when the driver successfully checks out a Connection + */ +interface ConnectionCheckedOutEvent { + /** + * The ServerAddress of the Endpoint the pool is attempting to connect to. + */ + address: string; + + /** + * The ID of the Connection + */ + connectionId: int64; + + /** + * The time it took to check out the connection. + * More specifically, the time elapsed between + * emitting a `ConnectionCheckOutStartedEvent` + * and emitting this event as part of the same checking out. + * + * Naturally, if a new connection was not created (`ConnectionCreatedEvent`) + * and established (`ConnectionReadyEvent`) as part of checking out, + * this duration is usually + * not greater than `ConnectionPoolOptions.waitQueueTimeoutMS`, + * but MAY occasionally be greater than that, + * because a driver does not provide hard real-time guarantees. + * + * A driver MAY choose the type idiomatic to the driver. + * If the type chosen does not convey units, e.g., `int64`, + * then the driver MAY include units in the name, e.g., `durationMS`. + */ + duration: Duration; +} + +/** + * Emitted when the driver checks in a Connection back to the Connection Pool + */ +interface ConnectionCheckedInEvent { + /** + * The ServerAddress of the Endpoint the pool is attempting to connect to. 
+ */ + address: string; + + /** + * The ID of the Connection + */ + connectionId: int64; +} +``` + +### Connection Pool Logging + +Please refer to the [logging specification](../logging/logging.rst) for details on logging implementations in general, +including log levels, log components, handling of null values in log messages, and structured versus unstructured +logging. + +Drivers MUST support logging of connection pool information via the following types of log messages. These messages MUST +be logged at `Debug` level and use the `connection` log component. These messages MUST be emitted when specified in +“Connection Pool Behaviors”. + +The log messages are intended to match the information contained in the events above. Drivers MAY implement connection +logging support via an event subscriber if it is convenient to do so. + +The types used in the structured message definitions below are demonstrative, and drivers MAY use similar types instead +so long as the information is present (e.g. a double instead of an integer, or a string instead of an integer if the +structured logging framework does not support numeric types). + +#### Common Fields + +All connection log messages MUST contain the following key-value pairs: + +| Key | Suggested Type | Value | +| ---------- | -------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| serverHost | String | the hostname, IP address, or Unix domain socket path for the endpoint the pool is for. | +| serverPort | Int | The port for the endpoint the pool is for. Optional; not present for Unix domain sockets. When the user does not specify a port and the default (27017) is used, the driver SHOULD include it here. 
| + +#### Pool Created Message + +In addition to the common fields defined above, this message MUST contain the following key-value pairs: + +| Key | Suggested Type | Value | +| ------------------ | -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------- | +| message | String | "Connection pool created" | +| maxIdleTimeMS | Int | The maxIdleTimeMS value for this pool. Optional; only required to include if the user specified a value. | +| minPoolSize | Int | The minPoolSize value for this pool. Optional; only required to include if the user specified a value. | +| maxPoolSize | Int | The maxPoolSize value for this pool. Optional; only required to include if the user specified a value. | +| maxConnecting | Int | The maxConnecting value for this pool. Optional; only required to include if the driver supports this option and the user specified a value. | +| waitQueueTimeoutMS | Int | The waitQueueTimeoutMS value for this pool. Optional; only required to include if the driver supports this option and the user specified a value. | +| waitQueueSize | Int | The waitQueueSize value for this pool. Optional; only required to include if the driver supports this option and the user specified a value. | +| waitQueueMultiple | Int | The waitQueueMultiple value for this pool. Optional; only required to include if the driver supports this option and the user specified a value. 
| + +The unstructured form SHOULD be as follows, using the values defined in the structured format above to fill in +placeholders as appropriate: + +> Connection pool created for {{serverHost}}:{{serverPort}} using options maxIdleTimeMS={{maxIdleTimeMS}}, +> minPoolSize={{minPoolSize}}, maxPoolSize={{maxPoolSize}}, maxConnecting={{maxConnecting}}, +> waitQueueTimeoutMS={{waitQueueTimeoutMS}}, waitQueueSize={{waitQueueSize}}, waitQueueMultiple={{waitQueueMultiple}} + +#### Pool Ready Message + +In addition to the common fields defined above, this message MUST contain the following key-value pairs: + +| Key | Suggested Type | Value | +| ------- | -------------- | ----------------------- | +| message | String | "Connection pool ready" | + +The unstructured form SHOULD be as follows, using the values defined in the structured format above to fill in +placeholders as appropriate: + +> Connection pool ready for {{serverHost}}:{{serverPort}} + +#### Pool Cleared Message + +In addition to the common fields defined above, this message MUST contain the following key-value pairs: + +| Key | Suggested Type | Value | +| --------- | -------------- | ----------------------------------------------------------------------------------------------------------------------------- | +| message | String | "Connection pool cleared" | +| serviceId | String | The hex string representation of the service ID which the pool was cleared for. Optional; only present in load balanced mode. 
| + +The unstructured form SHOULD be as follows, using the values defined in the structured format above to fill in +placeholders as appropriate: + +> Connection pool for {{serverHost}}:{{serverPort}} cleared for serviceId {{serviceId}} + +#### Pool Closed Message + +In addition to the common fields defined above, this message MUST contain the following key-value pairs: + +| Key | Suggested Type | Value | +| ------- | -------------- | ------------------------ | +| message | String | "Connection pool closed" | + +The unstructured form SHOULD be as follows, using the values defined in the structured format above to fill in +placeholders as appropriate: + +> Connection pool closed for {{serverHost}}:{{serverPort}} + +#### Connection Created Message + +In addition to the common fields defined above, this message MUST contain the following key-value pairs: + +| Key | Suggested Type | Value | +| ------------------ | -------------- | ----------------------------------------------------------------------------------- | +| message | String | "Connection created" | +| driverConnectionId | Int64 | The driver-generated ID for the connection as defined in [Connection](#connection). | + +The unstructured form SHOULD be as follows, using the values defined in the structured format above to fill in +placeholders as appropriate: + +> Connection created: address={{serverHost}}:{{serverPort}}, driver-generated ID={{driverConnectionId}} + +#### Connection Ready Message + +In addition to the common fields defined above, this message MUST contain the following key-value pairs: + +| Key | Suggested Type | Value | +| ------------------ | -------------- | ----------------------------------------------------------------------------------- | +| message | String | "Connection ready" | +| driverConnectionId | Int64 | The driver-generated ID for the connection as defined in [Connection](#connection). | +| durationMS | Int64 | `ConnectionReadyEvent.duration` converted to milliseconds. 
| + +The unstructured form SHOULD be as follows, using the values defined in the structured format above to fill in +placeholders as appropriate: + +> Connection ready: address={{serverHost}}:{{serverPort}}, driver-generated ID={{driverConnectionId}}, established +> in={{durationMS}} ms + +#### Connection Closed Message + +In addition to the common fields defined above, this message MUST contain the following key-value pairs: + +| Key | Suggested Type | Value | +| ------------------ | -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| message | String | "Connection closed" | +| driverConnectionId | Int64 | The driver-generated ID for the connection as defined in a [Connection](#connection). | +| reason | String | A string describing the reason the connection was closed. The following strings MUST be used for each possible reason as defined in [Events](#events) above:
- Stale: "Connection became stale because the pool was cleared"
- Idle: "Connection has been available but unused for longer than the configured max idle time"
- Error: "An error occurred while using the connection"
- Pool closed: "Connection pool was closed" | +| error | Flexible | If `reason` is `Error`, the associated error.
The type and format of this value is flexible; see the [logging specification](../logging/logging.rst#representing-errors-in-log-messages) for details on representing errors in log messages. | + +The unstructured form SHOULD be as follows, using the values defined in the structured format above to fill in +placeholders as appropriate: + +> Connection closed: address={{serverHost}}:{{serverPort}}, driver-generated ID={{driverConnectionId}}. Reason: +> {{reason}}. Error: {{error}} + +#### Connection Checkout Started Message + +In addition to the common fields defined above, this message MUST contain the following key-value pairs: + +| Key | Suggested Type | Value | +| ------- | -------------- | ----------------------------- | +| message | String | "Connection checkout started" | + +The unstructured form SHOULD be as follows, using the values defined in the structured format above to fill in +placeholders as appropriate: + +> Checkout started for connection to {{serverHost}}:{{serverPort}} + +#### Connection Checkout Failed Message + +In addition to the common fields defined above, this message MUST contain the following key-value pairs: + +| Key | Suggested Type | Value | +| ---------- | -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | +| message | String | "Connection checkout failed" | +| reason | String | A string describing the reason checkout. The following strings MUST be used for each possible reason as defined in [Events](#events) above:
- Timeout: "Wait queue timeout elapsed without a connection becoming available"
- ConnectionError: "An error occurred while trying to establish a new connection"
- Pool closed: "Connection pool was closed" | +| error | Flexible | If `reason` is `ConnectionError`, the associated error. The type and format of this value is flexible; see the [logging specification](../logging/logging.rst#representing-errors-in-log-messages) for details on representing errors in log messages. | +| durationMS | Int64 | `ConnectionCheckOutFailedEvent.duration` converted to milliseconds. | + +The unstructured form SHOULD be as follows, using the values defined in the structured format above to fill in +placeholders as appropriate: + +> Checkout failed for connection to {{serverHost}}:{{serverPort}}. Reason: {{reason}}. Error: {{error}}. Duration: +> {{durationMS}} ms + +#### Connection Checked Out + +In addition to the common fields defined above, this message MUST contain the following key-value pairs: + +| Key | Suggested Type | Value | +| ------------------ | -------------- | ----------------------------------------------------------------------------------- | +| message | String | "Connection checked out" | +| driverConnectionId | Int64 | The driver-generated ID for the connection as defined in [Connection](#connection). | +| durationMS | Int64 | `ConnectionCheckedOutEvent.duration` converted to milliseconds. 
| + +The unstructured form SHOULD be as follows, using the values defined in the structured format above to fill in +placeholders as appropriate: + +> Connection checked out: address={{serverHost}}:{{serverPort}}, driver-generated ID={{driverConnectionId}}, +> duration={{durationMS}} ms + +#### Connection Checked In + +In addition to the common fields defined above, this message MUST contain the following key-value pairs: + +| Key | Suggested Type | Value | +| ------------------ | -------------- | ----------------------------------------------------------------------------------- | +| message | String | "Connection checked in" | +| driverConnectionId | Int64 | The driver-generated ID for the connection as defined in [Connection](#connection). | + +The unstructured form SHOULD be as follows, using the values defined in the structured format above to fill in +placeholders as appropriate: + +> Connection checked in: address={{serverHost}}:{{serverPort}}, driver-generated ID={{driverConnectionId}} + +### Connection Pool Errors + +A connection pool throws errors in specific circumstances. These Errors MUST be emitted by the pool. Errors SHOULD be +created and dispatched in a manner idiomatic to the Driver and Language. 
+ +```typescript +/** + * Thrown when the driver attempts to check out a + * Connection from a closed Connection Pool + */ +interface PoolClosedError { + message: 'Attempted to check out a Connection from closed connection pool'; + address: ; +} + +/** + * Thrown when the driver attempts to check out a + * Connection from a paused Connection Pool + */ +interface PoolClearedError extends RetryableError { + message: 'Connection pool for was cleared because another operation failed with: '; + address: ; +} + +/** + * Thrown when a driver times out when attempting to check out + * a Connection from a Pool + */ +interface WaitQueueTimeoutError { + message: 'Timed out while checking out a Connection from connection pool'; + address: ; +} +``` + +## Test Plan + +See [tests/README](tests/README.md) + +## Design Rationale + +### Why do we set minPoolSize across all members of a replicaSet, when most traffic will be against a Primary? + +Currently, we are attempting to codify our current pooling behavior with minimal changes, and minPoolSize is currently +uniform across all members of a replicaSet. This has the benefit of offsetting connection swarming during a Primary +Step-Down, which will be further addressed in our [Advanced Pooling Behaviors](#advanced-pooling-behaviors). + +### Why do we have separate ConnectionCreated and ConnectionReady events, but only one ConnectionClosed event? + +ConnectionCreated and ConnectionReady each involve different state changes in the pool. + +- ConnectionCreated adds a new “pending” [Connection](#connection), meaning the totalConnectionCount and + pendingConnectionCount increase by one +- ConnectionReady establishes that the [Connection](#connection) is ready for use, meaning the availableConnectionCount + increases by one + +ConnectionClosed indicates that the [Connection](#connection) is no longer a member of the pool, decrementing +totalConnectionCount and potentially availableConnectionCount. 
After this point, the [Connection](#connection) is no +longer a part of the pool. Further hypothetical events would not indicate a change to the state of the pool, so they are +not specified here. + +### Why are waitQueueSize and waitQueueMultiple deprecated? + +These options were originally only implemented in three drivers (Java, C#, and Python), and provided little value. While +these fields would allow for faster diagnosis of issues in the connection pool, they would not actually prevent an error +from occurring. + +Additionally, these options have the effect of prioritizing older requests over newer requests, which is not necessarily +the behavior that users want. They can also result in cases where queue access oscillates back and forth between full +and not full. If a driver has a full waitQueue, then all requests for [Connections](#connection) will be rejected. If +the client is continually spammed with requests, you could wind up with a scenario where as soon as the waitQueue is no +longer full, it is immediately filled. It is not a favorable situation to be in, partly because it violates the fairness +guarantee that the waitQueue normally provides. + +Because of these issues, it does not make sense to +[go against driver mantras and provide an additional knob](../../README.md#). We may eventually pursue alternative +configurations to address wait queue size in [Advanced Pooling Behaviors](#advanced-pooling-behaviors). + +Users that wish to have this functionality can achieve similar results by utilizing other methods to limit concurrency. +Examples include implementing either a thread pool or an operation queue with a capped size in the user application. +Drivers that need to deprecate `waitQueueSize` and/or `waitQueueMultiple` SHOULD refer users to these examples. + +### Why is waitQueueTimeoutMS optional for some drivers? 
+ +We are anticipating eventually introducing a single client-side timeout mechanism, making us hesitant to introduce +another granular timeout control. Therefore, if a driver/language already has an idiomatic way to implement its +timeouts, it should leverage that mechanism over implementing waitQueueTimeoutMS. + +### Why must populating the pool require the use of a background thread or async I/O? + +Without the use of a background thread, the pool is +[populated](#populating-the-pool-with-a-connection-internal-implementation) with enough connections to satisfy +minPoolSize during checkOut. [Connections](#connection) are established as part of populating the pool though, so if +[Connection](#connection) establishment were done in a blocking fashion, the first operations after a clearing of the +pool would experience unacceptably high latency, especially for larger values of minPoolSize. Thus, populating the pool +must occur on a background thread (which is acceptable to block) or via the usage of non-blocking (async) I/O. + +### Why should closing a connection be non-blocking? + +Because idle and perished [Connections](#connection) are cleaned up as part of checkOut, performing blocking I/O while +closing such [Connections](#connection) would block application threads, introducing unnecessary latency. Once a +[Connection](#connection) is marked as "closed", it will not be checked out again, so ensuring the socket is torn down +does not need to happen immediately and can happen at a later time, either via async I/O or a background thread. + +### Why can the pool be paused? + +The distinction between the "paused" state and the "ready" state allows the pool to determine whether the +endpoint it is associated with is available. This enables the following behaviors: + +1. The pool can halt background connection establishment until the endpoint becomes available again. 
+ Without the "paused" state, the pool would have no way of determining when to begin establishing background + connections again, so it would just continually attempt, and often fail, to create connections until minPoolSize was + satisfied, even after repeated failures. This could unnecessarily waste resources on both the server and driver side. +1. The pool can evict requests that enter the WaitQueue after the pool was cleared but before the server was in a known + state again. Such requests can occur when a server is selected at the same time as it becomes marked as Unknown in + highly concurrent workloads. Without the "paused" state, the pool would attempt to service these requests, since it + would assume they were routed to the pool because its endpoint was available, not because of a race between SDAM and + Server Selection. These requests would then likely fail with potentially high latency, again wasting resources on both + the server and driver side. + +### Why not emit PoolCleared events and log messages when clearing a paused pool? + +If a pool is already paused when it is cleared, that means it was previously cleared and no new connections have been +created since then. Thus, clearing the pool in this case is essentially a no-op, so there is no need to notify any +listeners that it has occurred. The generation is still incremented, however, to ensure future errors that caused the +duplicate clear will stop attempting to clear the pool again. This situation is possible if the pool is cleared by the +background thread after it encounters an error establishing a connection, but the ServerDescription for the endpoint was +not updated accordingly yet. + +### Why does the pool need to support interrupting in use connections as part of its clear logic? + +If an SDAM monitor has observed a network timeout, we assume that all connections including "in use" connections are no +longer healthy. In some cases connections will fail to detect the network timeout fast enough.
For example, a server +request can hang at the OS level in a TCP retry loop for up to 17 minutes before failing. Therefore these connections MUST be +proactively interrupted in the case of a server monitor network timeout. Requesting an immediate background thread run +will speed up this process. + +### Why don't we configure TCP_USER_TIMEOUT? + +Ideally, a reasonable TCP_USER_TIMEOUT can help with detecting stale connections as an alternative to +`interruptInUseConnections` in Clear. Unfortunately this approach is platform dependent and not every driver allows +easily configuring it. For example, the C# driver can configure this socket option on Linux only with target frameworks +higher than or equal to .NET 5.0. On macOS, there is no direct equivalent for this option. It is possible that we can find +some equivalent configuration, but this configuration will also require target frameworks higher than or equal to .NET +5.0. The advantage of using the Background Thread to manage perished connections is that it will work regardless of +environment setup. + +## Backwards Compatibility + +As mentioned in [Deprecated Options](#deprecated-options), some drivers currently implement the options `waitQueueSize` +and/or `waitQueueMultiple`. These options will need to be deprecated and phased out of the drivers that have implemented +them. + +## Reference Implementations + +- JAVA (JAVA-3079) +- RUBY (RUBY-1560) + +## Future Development + +### SDAM + +This specification does not dictate how SDAM Monitoring connections are managed. SDAM specifies that “A monitor SHOULD +NOT use the client's regular Connection pool”. Some possible solutions for this include: + +- Having each Endpoint representation in the driver create and manage a separate dedicated [Connection](#connection) for + monitoring purposes +- Having each Endpoint representation in the driver maintain a separate pool of maxPoolSize 1 for monitoring purposes.
+- Having each Pool maintain a dedicated [Connection](#connection) for monitoring purposes, with an API to expose that + Connection. + +### Advanced Pooling Behaviors + +This spec does not address all advanced pooling behaviors like predictive pooling or aggressive +[Connection](#connection) creation. Future work may address this. + +### Add support for OP_MSG exhaustAllowed + +Exhaust Cursors may require changes to how we close [Connections](#connection) in the future, specifically to add a way +to close and remove from its pool a [Connection](#connection) which has unread exhaust messages. + +## Changelog + +- 2024-01-23: Migrated from reStructuredText to Markdown. + +- 2019-06-06: Add "connectionError" as a valid reason for ConnectionCheckOutFailedEvent + +- 2020-09-03: Clarify Connection states and definition. Require the use of a\ + background thread and/or async I/O. Add + tests to ensure ConnectionReadyEvents are fired after ConnectionCreatedEvents. + +- 2020-09-24: Introduce maxConnecting requirement + +- 2020-12-17: Introduce "paused" and "ready" states. Clear WaitQueue on pool clear. + +- 2021-01-12: Clarify "clear" method behavior in load balancer mode. + +- 2021-01-19: Require that timeouts be applied per the client-side operations\ + timeout specification. + +- 2021-04-12: Adding in behaviour for load balancer mode. + +- 2021-06-02: Formalize the behavior of a [Background Thread](#background-thread). + +- 2021-11-08: Make maxConnecting configurable. + +- 2022-04-05: Preemptively cancel in progress operations when SDAM heartbeats timeout. + +- 2022-10-05: Remove spec front matter and reformat changelog. + +- 2022-10-14: Add connection pool log messages and associated tests. + +- 2023-04-17: Fix duplicate logging test description. + +- 2023-08-04: Add durations to connection pool events. + +- 2023-10-04: Commit to the currently specified requirements regarding durations in events. 
+ +______________________________________________________________________ diff --git a/source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst b/source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst deleted file mode 100644 index 8db7909c1f..0000000000 --- a/source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst +++ /dev/null @@ -1,1627 +0,0 @@ -================================= -Connection Monitoring and Pooling -================================= - -:Status: Accepted -:Minimum Server Version: N/A - -.. contents:: - -Abstract -======== - -Drivers currently support a variety of options that allow users to configure connection pooling behavior. Users are confused by drivers supporting different subsets of these options. Additionally, drivers implement their connection pools differently, making it difficult to design cross-driver pool functionality. By unifying and codifying pooling options and behavior across all drivers, we will increase user comprehension and code base maintainability. - -This specification does not apply to drivers that do not support multitasking. - -META -==== - -The keywords “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in `RFC 2119 `_. - -Definitions -=========== - -Connection -~~~~~~~~~~~~~~ - -A Connection (when linked) refers to the ``Connection`` type defined in the -`Connection Pool Members`_ section of this specification. It does not refer to an actual TCP -connection to an Endpoint. A ``Connection`` will attempt to create and wrap such -a TCP connection over the course of its existence, but it is not equivalent to -one nor does it wrap an active one at all times. - -For the purposes of testing, a mocked ``Connection`` type could be used with the -pool that never actually creates a TCP connection or performs any I/O. 
- -Endpoint -~~~~~~~~ - -For convenience, an Endpoint refers to either a **mongod** or **mongos** instance. - -Thread -~~~~~~ - -For convenience, a Thread refers to: - -- A shared-address-space process (a.k.a. a thread) in multi-threaded drivers -- An Execution Frame / Continuation in asynchronous drivers -- A goroutine in Go - -Behavioral Description -====================== - -Which Drivers this applies to -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -This specification is solely concerned with drivers that implement a connection pool. A driver SHOULD implement a connection pool, but is not required to. - -Connection Pool Options -~~~~~~~~~~~~~~~~~~~~~~~ - -All drivers that implement a connection pool MUST implement and conform to the same MongoClient options. There can be slight deviation in naming to make the options idiomatic to the driver language. - -Connection Pool Behaviors -~~~~~~~~~~~~~~~~~~~~~~~~~ - -All driver connection pools MUST provide an API that allows the driver to check out a connection, check in a connection back to the pool, and clear all connections in the pool. This API is for internal use only, and SHOULD NOT be documented as a public API. - -Connection Pool Monitoring -~~~~~~~~~~~~~~~~~~~~~~~~~~ - -All drivers that implement a connection pool MUST provide an API that allows users to subscribe to events emitted from the pool. -Conceptually, event emission is instantaneous, i.e., one may talk about the instant an event is emitted, -and represents the start of an activity of delivering the event to a subscribed user. - -Detailed Design -=============== - -.. _connection-pool-options-1: - -Connection Pool Options -~~~~~~~~~~~~~~~~~~~~~~~ - -Drivers that implement a Connection Pool MUST support the following ConnectionPoolOptions: - -.. code:: typescript - - interface ConnectionPoolOptions { - /** - * The maximum number of Connections that may be associated - * with a pool at a given time. This includes in use and - * available connections. 
- * If specified, MUST be an integer >= 0. - * A value of 0 means there is no limit. - * Defaults to 100. - */ - maxPoolSize?: number; - - /** - * The minimum number of Connections that MUST exist at any moment - * in a single connection pool. - * If specified, MUST be an integer >= 0. If maxPoolSize is > 0 - * then minPoolSize must be <= maxPoolSize - * Defaults to 0. - */ - minPoolSize?: number; - - /** - * The maximum amount of time a Connection should remain idle - * in the connection pool before being marked idle. - * If specified, MUST be a number >= 0. - * A value of 0 means there is no limit. - * Defaults to 0. - */ - maxIdleTimeMS?: number; - - /** - * The maximum number of Connections a Pool may be establishing concurrently. - * Establishment of a Connection is a part of its life cycle - * starting after a ConnectionCreatedEvent and ending before a ConnectionReadyEvent. - * If specified, MUST be a number > 0. - * Defaults to 2. - */ - maxConnecting?: number; - } - -Additionally, Drivers that implement a Connection Pool MUST support the following ConnectionPoolOptions UNLESS that driver meets ALL of the following conditions: - -- The driver/language currently has an idiomatic timeout mechanism implemented -- The timeout mechanism conforms to `the aggressive requirement of timing out a thread in the WaitQueue <#w1dcrm950sbn>`__ - -.. code:: typescript - - interface ConnectionPoolOptions { - /** - * NOTE: This option has been deprecated in favor of timeoutMS. - * - * The maximum amount of time a thread can wait for - * either an available non-perished connection (limited by `maxPoolSize`), - * or a pending connection (limited by `maxConnecting`). - * If specified, MUST be a number >= 0. - * A value of 0 means there is no limit. - * Defaults to 0. - */ - waitQueueTimeoutMS?: number; - } - -These options MUST be specified at the MongoClient level, and SHOULD be named in a manner idiomatic to the driver's language. 
All connection pools created by a MongoClient MUST use the same ConnectionPoolOptions. - -When parsing a mongodb connection string, a user MUST be able to specify these options using the default names specified above. - -Deprecated Options ------------------- - -The following ConnectionPoolOptions are considered deprecated. They MUST NOT be implemented if they do not already exist in a driver, and they SHOULD be deprecated and removed from drivers that implement them as early as possible: - -.. code:: typescript - - interface ConnectionPoolOptions { - /** - * The maximum number of threads that can simultaneously wait - * for a Connection to become available. - */ - waitQueueSize?: number; - - /** - * An alternative way of setting waitQueueSize, it specifies - * the maximum number of threads that can wait per connection. - * waitQueueSize === waitQueueMultiple \* maxPoolSize - */ - waitQueueMultiple?: number - } - -Connection Pool Members -~~~~~~~~~~~~~~~~~~~~~~~ - -Connection ----------- - -A driver-defined wrapper around a single TCP connection to an Endpoint. A `Connection`_ has the following properties: - -- **Single Endpoint:** A `Connection`_ MUST be associated with a single Endpoint. A `Connection`_ MUST NOT be associated with multiple Endpoints. -- **Single Lifetime:** A `Connection`_ MUST NOT be used after it is closed. -- **Single Owner:** A `Connection`_ MUST belong to exactly one Pool, and MUST NOT be shared across multiple pools -- **Single Track:** A `Connection`_ MUST limit itself to one request / response at a time. A `Connection`_ MUST NOT multiplex/pipeline requests to an Endpoint. -- **Monotonically Increasing ID:** A `Connection`_ MUST have an ID number associated with it. `Connection`_ IDs within a Pool MUST be assigned in order of creation, starting at 1 and increasing by 1 for each new Connection. 
-- **Valid Connection:** A connection MUST NOT be checked out of the pool until it has successfully and fully completed a MongoDB Handshake and Authentication as specified in the `Handshake `__, `OP_COMPRESSED `__, and `Authentication `__ specifications. -- **Perishable**: it is possible for a `Connection`_ to become **Perished**. A `Connection`_ is considered perished if any of the following are true: - - - **Stale:** The `Connection`_ 's generation does not match the generation of the parent pool - - **Idle:** The `Connection`_ is currently "available" (as defined below) and has been for longer than **maxIdleTimeMS**. - - **Errored:** The `Connection`_ has experienced an error that indicates it is no longer recommended for use. Examples include, but are not limited to: - - - Network Error - - Network Timeout - - Endpoint closing the connection - - Driver-Side Timeout - - Wire-Protocol Error - -.. code:: typescript - - interface Connection { - /** - * An id number associated with the Connection - */ - id: number; - - /** - * The address of the pool that owns this Connection - */ - address: string; - - /** - * An integer representing the “generation” of the pool - * when this Connection was created. - */ - generation: number; - - /** - * The current state of the Connection. - * - * Possible values are the following: - * - "pending": The Connection has been created but has not yet been established. Contributes to - * totalConnectionCount and pendingConnectionCount. - * - * - "available": The Connection has been established and is waiting in the pool to be checked - * out. Contributes to both totalConnectionCount and availableConnectionCount. - * - * - "in use": The Connection has been established, checked out from the pool, and has yet - * to be checked back in. Contributes to totalConnectionCount. - * - * - "closed": The Connection has had its socket closed and cannot be used for any future - * operations. Does not contribute to any connection counts. 
- * - * Note: this field is mainly used for the purposes of describing state - * in this specification. It is not required that drivers - * actually include this field in their implementations of Connection. - */ - state: "pending" | "available" | "in use" | "closed"; - } - -WaitQueue ---------- - -A concept that represents pending requests for `Connections <#connection>`_. When a thread requests a `Connection <#connection>`_ from a Pool, the thread enters the Pool's WaitQueue. A thread stays in the WaitQueue until it either receives a `Connection <#connection>`_ or times out. A WaitQueue has the following traits: - -- **Thread-Safe**: When multiple threads attempt to enter or exit a WaitQueue, they do so in a thread-safe manner. -- **Ordered/Fair**: When `Connections <#connection>`_ are made available, they are issued out to threads in the order that the threads entered the WaitQueue. -- **Timeout aggressively:** Members of a WaitQueue MUST timeout if they are enqueued for longer than the computed timeout and MUST leave the WaitQueue immediately in this case. - -The implementation details of a WaitQueue are left to the driver. -Example implementations include: - -- A fair Semaphore -- A Queue of callbacks - -Connection Pool ---------------- - -A driver-defined entity that encapsulates all non-monitoring -`Connections <#connection>`_ associated with a single Endpoint. The pool -has the following properties: - -- **Thread Safe:** All Pool behaviors MUST be thread safe. -- **Not Fork-Safe:** A Pool is explicitly not fork-safe. If a Pool detects that is it being used by a forked process, it MUST immediately clear itself and update its pid -- **Single Owner:** A Pool MUST be associated with exactly one Endpoint, and MUST NOT be shared between Endpoints. -- **Emit Events and Log Messages:** A Pool MUST emit pool events and log messages when dictated by this spec (see `Connection Pool Monitoring <#connection-pool-monitoring>`__). 
Users MUST be able to subscribe to emitted events and log messages in a manner idiomatic to their language and driver. -- **Closeable:** A Pool MUST be able to be manually closed. When a Pool is closed, the following behaviors change: - - - Checking in a `Connection <#connection>`_ to the Pool automatically closes the `Connection <#connection>`_ - - Attempting to check out a `Connection <#connection>`_ from the Pool results in an Error - -- **Clearable:** A Pool MUST be able to be cleared. Clearing the pool marks all pooled and checked out `Connections <#connection>`_ as stale and lazily closes them as they are checkedIn or encountered in checkOut. Additionally, all requests are evicted from the WaitQueue and return errors that are considered non-timeout network errors. - -- **Pausable:** A Pool MUST be able to be paused and resumed. A Pool is paused automatically when it is cleared, and it can be resumed by being marked as "ready". While the Pool is paused, it exhibits the following behaviors: - - - Attempting to check out a `Connection <#connection>`_ from the Pool results in a non-timeout network error - - Connections are not created in the background to satisfy minPoolSize - -- **Capped:** a pool is capped if **maxPoolSize** is set to a non-zero value. If a pool is capped, then its total number of `Connections <#connection>`_ (including available and in use) MUST NOT exceed **maxPoolSize** -- **Rate-limited:** A Pool MUST limit the number of `Connections <#connection>`_ being `established <#establishing-a-connection-internal-implementation>`_ concurrently via the **maxConnecting** `pool option <#connection-pool-options-1>`_. - - -.. code:: typescript - - interface ConnectionPool { - /** - * The Queue of threads waiting for a Connection to be available - */ - waitQueue: WaitQueue; - - /** - * A generation number representing the SDAM generation of the pool. 
- */ - generation: number; - - /** - * A map representing the various generation numbers for various services - * when in load balancer mode. - */ - serviceGenerations: Map; - - /** - * The state of the pool. - * - * Possible values are the following: - * - "paused": The initial state of the pool. Connections may not be checked out nor can they - * be established in the background to satisfy minPoolSize. Clearing a pool - * transitions it to this state. - * - * - "ready": The healthy state of the pool. It can service checkOut requests and create - * connections in the background. The pool can be set to this state via the - * ready() method. - * - * - "closed": The pool is destroyed. No more Connections may ever be checked out nor any - * created in the background. The pool can be set to this state via the close() - * method. The pool cannot transition to any other state after being closed. - */ - state: "paused" | "ready" | "closed"; - - // Any of the following connection counts may be computed rather than - // actually stored on the pool. - - /** - * An integer expressing how many total Connections - * ("pending" + "available" + "in use") the pool currently has - */ - totalConnectionCount: number; - - /** - * An integer expressing how many Connections are currently - * available in the pool. - */ - availableConnectionCount: number; - - /** - * An integer expressing how many Connections are currently - * being established. - */ - pendingConnectionCount: number; - - /** - * Returns a Connection for use - */ - checkOut(): Connection; - - /** - * Check in a Connection back to the Connection pool - */ - checkIn(connection: Connection): void; - - /** - * Mark all current Connections as stale, clear the WaitQueue, and mark the pool as "paused". - * No connections may be checked out or created in this pool until ready() is called again. - * interruptInUseConnections specifies whether the pool will force interrupt "in use" connections as part of the clear. 
- * Default false. - */ - clear(interruptInUseConnections: Optional): void; - - /** - * Mark the pool as "ready", allowing checkOuts to resume and connections to be created in the background. - * A pool can only transition from "paused" to "ready". A "closed" pool - * cannot be marked as "ready" via this method. - */ - ready(): void; - - /** - * Marks the pool as "closed", preventing the pool from creating and returning new Connections - */ - close(): void; - } - -.. _connection-pool-behaviors-1: - -Connection Pool Behaviors -~~~~~~~~~~~~~~~~~~~~~~~~~ - -Creating a Connection Pool --------------------------- - -This specification does not define how a pool is to be created, leaving it -up to the driver. Creation of a connection pool is generally an implementation -detail of the driver, i.e., is not a part of the public API of the driver. -The SDAM specification defines `when -`_ -the driver should create connection pools. - -When a pool is created, its state MUST initially be set to "paused". Even if -minPoolSize is set, the pool MUST NOT begin being `populated -<#populating-the-pool-with-a-connection-internal-implementation>`_ with -`Connections <#connection>`_ until it has been marked as "ready". SDAM will mark -the pool as "ready" on each successful check. See `Connection Pool Management`_ -section in the SDAM specification for more information. - -.. code:: - - set generation to 0 - set state to "paused" - emit PoolCreatedEvent and equivalent log message - -Closing a Connection Pool -------------------------- - -When a pool is closed, it MUST first close all available `Connections <#connection>`_ in that pool. This results in the following behavior changes: - -- In use `Connections <#connection>`_ MUST be closed when they are checked in to the closed pool. -- Attempting to check out a `Connection <#connection>`_ MUST result in an error. - -.. 
code:: - - mark pool as "closed" - for connection in availableConnections: - close connection - emit PoolClosedEvent and equivalent log message - -Marking a Connection Pool as Ready ----------------------------------- - -Connection Pools start off as "paused", and they are marked as "ready" by -monitors after they perform successful server checks. Once a pool is "ready", -it can start checking out `Connections <#connection>`_ and populating them in -the background. - -If the pool is already "ready" when this method is invoked, then this -method MUST immediately return and MUST NOT emit a PoolReadyEvent. - -.. code:: - - mark pool as "ready" - emit PoolReadyEvent and equivalent log message - allow background thread to create connections - -Note that the PoolReadyEvent MUST be emitted before the background thread is allowed to resume creating new connections, -and it must be the case that no observer is able to observe actions of the background thread -related to creating new connections before observing the PoolReadyEvent event. - -Creating a Connection (Internal Implementation) ------------------------------------------------ - -When creating a `Connection <#connection>`_, the initial `Connection <#connection>`_ is in a -“pending” state. This only creates a “virtual” `Connection <#connection>`_, and -performs no I/O. - -.. code:: - - connection = new Connection() - increment totalConnectionCount - increment pendingConnectionCount - set connection state to "pending" - tConnectionCreated = current instant (use a monotonic clock if possible) - emit ConnectionCreatedEvent and equivalent log message - return connection - -Establishing a Connection (Internal Implementation) ---------------------------------------------------- - -Before a `Connection <#connection>`_ can be marked as either "available" or "in use", it -must be established. This process involves performing the initial -handshake, handling OP_COMPRESSED, and performing authentication. - -.. 
code:: - - try: - connect connection via TCP / TLS - perform connection handshake - handle OP_COMPRESSED - perform connection authentication - tConnectionReady = current instant (use a monotonic clock if possible) - emit ConnectionReadyEvent(duration = tConnectionReady - tConnectionCreated) and equivalent log message - return connection - except error: - close connection - throw error # Propagate error in manner idiomatic to language. - - -Closing a Connection (Internal Implementation) ----------------------------------------------- - -When a `Connection <#connection>`_ is closed, it MUST first be marked as "closed", -removing it from being counted as "available" or "in use". Once that is -complete, the `Connection <#connection>`_ can perform whatever teardown is -necessary to close its underlying socket. The Driver SHOULD perform this -teardown in a non-blocking manner, such as via the use of a background -thread or async I/O. - -.. code:: - - original state = connection state - set connection state to "closed" - - if original state is "available": - decrement availableConnectionCount - else if original state is "pending": - decrement pendingConnectionCount - - decrement totalConnectionCount - emit ConnectionClosedEvent and equivalent log message - - # The following can happen at a later time (i.e. in background - # thread) or via non-blocking I/O. - connection.socket.close() - -Marking a Connection as Available (Internal Implementation) ------------------------------------------------------------ - -A `Connection <#connection>`_ is "available" if it is able to be checked out. A -`Connection <#connection>`_ MUST NOT be marked as "available" until it has been -established. The pool MUST keep track of the number of currently -available `Connections <#connection>`_. - -.. 
code:: - - increment availableConnectionCount - set connection state to "available" - add connection to availableConnections - - -Populating the Pool with a Connection (Internal Implementation) ---------------------------------------------------------------- - -"Populating" the pool involves preemptively creating and establishing a -`Connection <#connection>`_ which is marked as "available" for use in future -operations. - -Populating the pool MUST NOT block any application threads. For example, it -could be performed on a background thread or via the use of non-blocking/async -I/O. Populating the pool MUST NOT be performed unless the pool is "ready". - -If an error is encountered while populating a connection, it MUST be handled via -the SDAM machinery according to the `Application Errors`_ section in the SDAM -specification. - -If minPoolSize is set, the `Connection <#connection>`_ Pool MUST be populated -until it has at least minPoolSize total `Connections <#connection>`_. This MUST -occur only while the pool is "ready". If the pool implements a background -thread, it can be used for this. If the pool does not implement a background -thread, the checkOut method is responsible for ensuring this requirement is met. - -When populating the Pool, pendingConnectionCount has to be decremented after -establishing a `Connection`_ similarly to how it is done in -`Checking Out a Connection <#checking-out-a-connection>`_ to signal that -another `Connection`_ is allowed to be established. Such a signal MUST become -observable to any `Thread`_ after the action that -`marks the established Connection as "available" <#marking-a-connection-as-available-internal-implementation>`_ -becomes observable to the `Thread`_. -Informally, this order guarantees that no `Thread`_ tries to start -establishing a `Connection`_ when there is an "available" `Connection`_ -established as a result of populating the Pool. - -.. 
code:: - - wait until pendingConnectionCount < maxConnecting and pool is "ready" - create connection - try: - establish connection - mark connection as available - except error: - # Defer error handling to SDAM. - topology.handle_pre_handshake_error(error) - -Checking Out a Connection -------------------------- - -A Pool MUST have a method that allows the driver to check out a `Connection`_. -Checking out a `Connection`_ involves submitting a request to the WaitQueue and, -once that request reaches the front of the queue, having the Pool find or create -a `Connection`_ to fulfill that request. Requests MUST be subject to a timeout -which is computed per the rules in -`Client Side Operations Timeout: Server Selection -<../client-side-operations-timeout/client-side-operations-timeout.md#server-selection>`_. - -To service a request for a `Connection`_, the Pool MUST first iterate over the -list of available `Connections <#connection>`_, searching for a non-perished one -to be returned. If a perished `Connection`_ is encountered, such a `Connection`_ -MUST be closed (as described in `Closing a Connection -<#closing-a-connection-internal-implementation>`_) and the iteration of -available `Connections <#connection>`_ MUST continue until either a non-perished -available `Connection`_ is found or the list of available `Connections -<#connection>`_ is exhausted. - -If the list is exhausted, the total number of `Connections <#connection>`_ is -less than maxPoolSize, and pendingConnectionCount < maxConnecting, the pool MUST -create a `Connection`_, establish it, mark it as "in use" and return it. If -totalConnectionCount == maxPoolSize or pendingConnectionCount == maxConnecting, -then the pool MUST wait to service the request until neither of those two -conditions are met or until a `Connection`_ becomes available, re-entering the -checkOut loop in either case. This waiting MUST NOT prevent `Connections -<#connection>`_ from being checked into the pool. 
Additionally, the Pool MUST -NOT service any newer checkOut requests before fulfilling the original one which -could not be fulfilled. For drivers that implement the WaitQueue via a fair -semaphore, a condition variable may also be needed to meet this -requirement. Waiting on the condition variable SHOULD also be limited by the -WaitQueueTimeout, if the driver supports one and it was specified by the user. - -If the pool is "closed" or "paused", any attempt to check out a `Connection -<#connection>`_ MUST throw an Error. The error thrown as a result of the pool -being "paused" MUST be considered a retryable error and MUST NOT be an error -that marks the SDAM state unknown. - -If the pool does not implement a background thread, the checkOut method is -responsible for ensuring that the pool is `populated -<#populating-the-pool-with-a-connection-internal-implementation>`_ with at least minPoolSize -`Connections <#connection>`_. - -A `Connection <#connection>`_ MUST NOT be checked out until it is -established. In addition, the Pool MUST NOT prevent other threads from checking -out `Connections <#connection>`_ while establishing a `Connection -<#connection>`_. - -Before a given `Connection <#connection>`_ is returned from checkOut, it must be marked as -"in use", and the pool's availableConnectionCount MUST be decremented. - -..
code:: - - connection = Null - tConnectionCheckOutStarted = current instant (use a monotonic clock if possible) - emit ConnectionCheckOutStartedEvent and equivalent log message - try: - enter WaitQueue - wait until at top of wait queue - # Note that a lock-based implementation of the wait queue would - # only allow one thread in the following block at a time - while connection is Null: - if a connection is available: - while connection is Null and a connection is available: - connection = next available connection - if connection is perished: - close connection - connection = Null - else if totalConnectionCount < maxPoolSize: - if pendingConnectionCount < maxConnecting: - connection = create connection - else: - # this waiting MUST NOT prevent other threads from checking Connections - # back in to the pool. - wait until pendingConnectionCount < maxConnecting or a connection is available - continue - - except pool is "closed": - tConnectionCheckOutFailed = current instant (use a monotonic clock if possible) - emit ConnectionCheckOutFailedEvent(reason="poolClosed", duration = tConnectionCheckOutFailed - tConnectionCheckOutStarted) and equivalent log message - throw PoolClosedError - except pool is "paused": - tConnectionCheckOutFailed = current instant (use a monotonic clock if possible) - emit ConnectionCheckOutFailedEvent(reason="connectionError", duration = tConnectionCheckOutFailed - tConnectionCheckOutStarted) and equivalent log message - throw PoolClearedError - except timeout: - tConnectionCheckOutFailed = current instant (use a monotonic clock if possible) - emit ConnectionCheckOutFailedEvent(reason="timeout", duration = tConnectionCheckOutFailed - tConnectionCheckOutStarted) and equivalent log message - throw WaitQueueTimeoutError - finally: - # This must be done in all drivers - leave wait queue - - # If the Connection has not been established yet (TCP, TLS, - # handshake, compression, and auth), it must be established - # before it is returned.
- # This MUST NOT block other threads from acquiring connections. - if connection state is "pending": - try: - establish connection - except connection establishment error: - tConnectionCheckOutFailed = current instant (use a monotonic clock if possible) - emit ConnectionCheckOutFailedEvent(reason="connectionError", duration = tConnectionCheckOutFailed - tConnectionCheckOutStarted) and equivalent log message - decrement totalConnectionCount - throw - finally: - decrement pendingConnectionCount - else: - decrement availableConnectionCount - set connection state to "in use" - - # If there is no background thread, the pool MUST ensure that - # there are at least minPoolSize total connections. - do asynchronously: - while totalConnectionCount < minPoolSize: - populate the pool with a connection - - tConnectionCheckedOut = current instant (use a monotonic clock if possible) - emit ConnectionCheckedOutEvent(duration = tConnectionCheckedOut - tConnectionCheckOutStarted) and equivalent log message - return connection - -Checking In a Connection ------------------------- - -A Pool MUST have a method of allowing the driver to check in a -`Connection <#connection>`_. The driver MUST NOT be allowed to check in a -`Connection <#connection>`_ to a Pool that did not create that `Connection <#connection>`_, and -MUST throw an Error if this is attempted. - -When the `Connection <#connection>`_ is checked in, it MUST be `closed -<#closing-a-connection-internal-implementation>`_ if any of the following are -true: - -- The `Connection <#connection>`_ is perished. -- The pool has been closed. - -Otherwise, the `Connection <#connection>`_ is marked as available. - -.. 
code:: - - emit ConnectionCheckedInEvent and equivalent log message - if connection is perished OR pool is closed: - close connection - else: - mark connection as available - -Clearing a Connection Pool --------------------------- - -Clearing the pool involves different steps depending on whether the pool is in -load balanced mode or not. The traditional / non-load balanced clearing behavior -MUST NOT be used by pools in load balanced mode, and the load balanced pool -clearing behavior MUST NOT be used in non-load balanced pools. - -Clearing a non-load balanced pool -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -A Pool MUST have a method of clearing all `Connections <#connection>`_ when -instructed. Rather than iterating through every `Connection <#connection>`_, -this method should simply increment the generation of the Pool, implicitly -marking all current `Connections <#connection>`_ as stale. It should also -transition the pool's state to "paused" to halt the creation of new connections -until it is marked as "ready" again. The checkOut and checkIn algorithms will -handle clearing out stale `Connections <#connection>`_. If a user is subscribed -to Connection Monitoring events and/or connection log messages, a PoolClearedEvent -and log message MUST be emitted after incrementing the generation / marking the pool -as "paused". If the pool is already "paused" when it is cleared, then the pool MUST -NOT emit a PoolCleared event or log message. - -As part of clearing the pool, the WaitQueue MUST also be cleared, meaning all -requests in the WaitQueue MUST fail with errors indicating that the pool was -cleared while the checkOut was being performed. The error returned as a result -of the pool being cleared MUST be considered a retryable error and MUST NOT be -an error that marks the SDAM state unknown. Clearing the WaitQueue MUST happen -eagerly so that any operations waiting on `Connections <#connection>`_ can retry -as soon as possible. 
The pool MUST NOT rely on WaitQueueTimeoutMS to clear -requests from the WaitQueue. - -The clearing method MUST provide the option to interrupt any in-use connections as part -of the clearing (henceforth referred to as the interruptInUseConnections flag in this -specification). "Interrupting a Connection" is defined as canceling whatever task the -Connection is currently performing and marking the Connection as perished (e.g. by closing -its underlying socket). The interrupting of these Connections MUST be performed as soon as possible -but MUST NOT block the pool or prevent it from processing further requests. If the pool has a background -thread, and it is responsible for interrupting in-use connections, its next run MUST be scheduled as soon as -possible. - -The pool MUST only interrupt in-use Connections whose generation is less than or equal -to the generation of the pool at the moment of the clear (before the increment) -that used the interruptInUseConnections flag. Any operations that have their Connections -interrupted in this way MUST fail with a retryable error. If possible, the error SHOULD -be a PoolClearedError with the following message: "Connection to interrupted -due to server monitor timeout". - -Clearing a load balanced pool -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -A Pool MUST also have a method of clearing all `Connections <#connection>`_ for -a specific ``serviceId`` for use when in load balancer mode. This method -increments the generation of the pool for that specific ``serviceId`` in the -generation map. A PoolClearedEvent and log message MUST be emitted after incrementing the -generation. Note that this method MUST NOT transition the pool to the "paused" -state and MUST NOT clear the WaitQueue. 
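The two clearing behaviors described above can be contrasted in a small sketch. This is an illustration only, not part of the specification: ``PoolSketch`` and its fields are hypothetical names, the per-``serviceId`` map is reduced to a generation counter (the connection count is omitted), and interrupting in-use connections is left out. It also assumes that a repeated clear of an already paused pool still increments the generation, which the text above does not state either way.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PoolSketch:
    """Hypothetical model holding only the state that clearing touches."""

    load_balanced: bool = False
    state: str = "ready"  # one of "paused", "ready", "closed"
    generation: int = 0  # used when the pool is not load balanced
    service_generations: Dict[str, int] = field(default_factory=dict)
    wait_queue: List[str] = field(default_factory=list)
    events: List[str] = field(default_factory=list)

    def clear(self, service_id: Optional[str] = None) -> List[str]:
        """Clear the pool and return the WaitQueue requests that were failed."""
        if self.load_balanced:
            if service_id is None:
                raise ValueError("serviceId is required in load balanced mode")
            # Only this serviceId's generation moves; the pool is not paused
            # and the WaitQueue is left untouched.
            current = self.service_generations.get(service_id, 0)
            self.service_generations[service_id] = current + 1
            self.events.append(f"PoolCleared(serviceId={service_id})")
            return []

        was_paused = self.state == "paused"
        # Assumption: a repeated clear still bumps the generation; the text
        # only forbids emitting a second PoolCleared event.
        self.generation += 1
        self.state = "paused"
        # Fail queued requests eagerly instead of waiting for their timeouts.
        failed, self.wait_queue = self.wait_queue, []
        if not was_paused:
            self.events.append("PoolCleared")
        return failed
```

In a real driver the returned requests would each fail with a retryable PoolClearedError; here they are simply handed back to the caller so the eager clearing is visible.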
 - -Load Balancer Mode ------------------- - -For load-balanced deployments, pools MUST maintain a map from ``serviceId`` to a -tuple of (generation, connection count) where the connection count refers to the -total number of connections that exist for a specific ``serviceId``. The pool MUST -remove the entry for a ``serviceId`` once the connection count reaches 0. -Once the MongoDB handshake is done, the connection MUST get the -generation number that applies to its ``serviceId`` from the map and update the -map to increment the connection count for this ``serviceId``. - -See the `Load Balancer Specification <../load-balancers/load-balancers.rst#connection-pooling>`__ for details. - - -Forking -------- - -A `Connection <#connection>`_ is explicitly not fork-safe. The proper behavior in the case of a fork is to ResetAfterFork by: - -- clearing all Connection Pools in the child process -- closing all `Connections <#connection>`_ in the child process. - -Drivers that support forking MUST document that `Connections <#connection>`_ to an Endpoint are not fork-safe, and document the proper way to ResetAfterFork in the driver. - -Drivers MAY aggressively ResetAfterFork if the driver detects it has been forked. - -Optional Behaviors ------------------- - -The following features of a Connection Pool SHOULD be implemented if they make sense in the driver and driver's language. - -Background Thread -^^^^^^^^^^^^^^^^^ - -A Pool SHOULD have a background Thread that is responsible for -monitoring the state of all available `Connections <#connection>`_. This background -thread SHOULD - -- Populate `Connections <#connection>`_ to ensure that the pool always satisfies minPoolSize. -- Remove and close perished available `Connections <#connection>`_ including "in use" connections if `interruptInUseConnections` option was set to true in the most recent pool clear.
-- Apply timeouts to connection establishment per `Client Side Operations - Timeout: Background Connection Pooling - <../client-side-operations-timeout/client-side-operations-timeout.md#background-connection-pooling>`__. - -A pool SHOULD allow immediate scheduling of the next background thread iteration after a clear is performed. - -Conceptually, the aforementioned activities are organized into sequential Background Thread Runs. -A Run MUST do as much work as readily available and then end instead of waiting for more work. -For example, instead of waiting for pendingConnectionCount to become less than maxConnecting when satisfying minPoolSize, -a Run MUST either proceed with the rest of its duties, e.g., closing available perished connections, or end. - -The duration of intervals between the end of one Run and the beginning of the next Run is not specified, -but the -`Test Format and Runner Specification `__ -may restrict this duration, or introduce other restrictions to facilitate testing. - -withConnection -^^^^^^^^^^^^^^ - -A Pool SHOULD implement a scoped resource management mechanism idiomatic to their language to prevent `Connections <#connection>`_ from not being checked in. Examples include `Python's "with" statement `__ and `C#'s "using" statement `__. If implemented, drivers SHOULD use this method as the default method of checking out and checking in `Connections <#connection>`_. - -.. _connection-pool-monitoring-1: - -Connection Pool Monitoring -~~~~~~~~~~~~~~~~~~~~~~~~~~ - -All drivers that implement a connection pool MUST provide an API that allows users to subscribe to events emitted from the pool. If a user subscribes to Connection Monitoring events, these events MUST be emitted when specified in “Connection Pool Behaviors”. Events SHOULD be created and subscribed to in a manner idiomatic to their language and driver. 
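As one possible shape for such a subscription API, the sketch below keeps a list of listeners per event name and calls them when the pool emits an event. ``PoolEventEmitter``, ``subscribe`` and ``emit`` are hypothetical names chosen for illustration; a real driver would expose whatever is idiomatic for its language.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

# A listener receives the event's fields as a plain dictionary.
Listener = Callable[[Dict[str, object]], None]


class PoolEventEmitter:
    """Hypothetical registry mapping event names to subscribed listeners."""

    def __init__(self) -> None:
        self._listeners: DefaultDict[str, List[Listener]] = defaultdict(list)

    def subscribe(self, event_name: str, listener: Listener) -> None:
        self._listeners[event_name].append(listener)

    def emit(self, event_name: str, **fields: object) -> None:
        # Events nobody subscribed to are dropped without side effects.
        for listener in self._listeners.get(event_name, []):
            listener(dict(fields))


received: List[Dict[str, object]] = []
emitter = PoolEventEmitter()
emitter.subscribe("PoolCreatedEvent", received.append)
emitter.emit("PoolCreatedEvent", address="localhost:27017")
emitter.emit("PoolReadyEvent", address="localhost:27017")  # no subscriber
```

After these calls ``received`` holds only the PoolCreatedEvent payload, since nothing subscribed to PoolReadyEvent.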
- -Events ------- - -See the `Load Balancer Specification <../load-balancers/load-balancers.rst#events>`__ for details on the ``serviceId`` field. - -.. code:: typescript - - /** - * Emitted when a Connection Pool is created - */ - interface PoolCreatedEvent { - /** - * The ServerAddress of the Endpoint the pool is attempting to connect to. - */ - address: string; - - /** - * Any non-default pool options that were set on this Connection Pool. - */ - options: {...} - } - - /** - * Emitted when a Connection Pool is marked as ready. - */ - interface PoolReadyEvent { - /** - * The ServerAddress of the Endpoint the pool is attempting to connect to. - */ - address: string; - } - - /** - * Emitted when a Connection Pool is cleared - */ - interface PoolClearedEvent { - /** - * The ServerAddress of the Endpoint the pool is attempting to connect to. - */ - address: string; - - /** - * The service id for which the pool was cleared for in load balancing mode. - * See load balancer specification for more information about this field. - */ - serviceId: Optional; - - /** - * A flag whether the pool forced interrupting "in use" connections as part of the clear. - */ - interruptInUseConnections: Optional; - } - - /** - * Emitted when a Connection Pool is closed - */ - interface PoolClosedEvent { - /** - * The ServerAddress of the Endpoint the pool is attempting to connect to. - */ - address: string; - } - - /** - * Emitted when a Connection Pool creates a Connection object. - * NOTE: This does not mean that the Connection is ready for use. - */ - interface ConnectionCreatedEvent { - /** - * The ServerAddress of the Endpoint the pool is attempting to connect to. - */ - address: string; - - /** - * The ID of the Connection - */ - connectionId: int64; - } - - /** - * Emitted when a Connection has finished its setup, and is now ready to use - */ - interface ConnectionReadyEvent { - /** - * The ServerAddress of the Endpoint the pool is attempting to connect to. 
- */ - address: string; - - /** - * The ID of the Connection - */ - connectionId: int64; - - /** - * The time it took to establish the connection. - * In accordance with the definition of establishment of a connection - * specified by `ConnectionPoolOptions.maxConnecting`, - * it is the time elapsed between emitting a `ConnectionCreatedEvent` - * and emitting this event as part of the same checking out. - * - * Naturally, when establishing a connection is part of checking out, - * this duration is not greater than - * `ConnectionCheckedOutEvent`/`ConnectionCheckOutFailedEvent.duration`. - * - * A driver MAY choose the type idiomatic to the driver. - * If the type chosen does not convey units, e.g., `int64`, - * then the driver MAY include units in the name, e.g., `durationMS`. - */ - duration: Duration; - } - - /** - * Emitted when a Connection Pool closes a Connection - */ - interface ConnectionClosedEvent { - /** - * The ServerAddress of the Endpoint the pool is attempting to connect to. - */ - address: string; - - /** - * The ID of the Connection - */ - connectionId: int64; - - /** - * A reason explaining why this Connection was closed. - * Can be implemented as a string or enum. - * Current valid values are: - * - "stale": The pool was cleared, making the Connection no longer valid - * - "idle": The Connection became stale by being available for too long - * - "error": The Connection experienced an error, making it no longer valid - * - "poolClosed": The pool was closed, making the Connection no longer valid - */ - reason: string|Enum; - } - - /** - * Emitted when the driver starts attempting to check out a Connection - */ - interface ConnectionCheckOutStartedEvent { - /** - * The ServerAddress of the Endpoint the pool is attempting - * to connect to. 
- */ - address: string; - } - - /** - * Emitted when the driver's attempt to check out a Connection fails - */ - interface ConnectionCheckOutFailedEvent { - /** - * The ServerAddress of the Endpoint the pool is attempting to connect to. - */ - address: string; - - /** - * A reason explaining why Connection check out failed. - * Can be implemented as a string or enum. - * Current valid values are: - * - "poolClosed": The pool was previously closed, and cannot provide new Connections - * - "timeout": The Connection check out attempt exceeded the specified timeout - * - "connectionError": The Connection check out attempt experienced an error while setting up a new Connection - */ - reason: string|Enum; - - /** - * See `ConnectionCheckedOutEvent.duration`. - */ - duration: Duration; - } - - /** - * Emitted when the driver successfully checks out a Connection - */ - interface ConnectionCheckedOutEvent { - /** - * The ServerAddress of the Endpoint the pool is attempting to connect to. - */ - address: string; - - /** - * The ID of the Connection - */ - connectionId: int64; - - /** - * The time it took to check out the connection. - * More specifically, the time elapsed between - * emitting a `ConnectionCheckOutStartedEvent` - * and emitting this event as part of the same checking out. - * - * Naturally, if a new connection was not created (`ConnectionCreatedEvent`) - * and established (`ConnectionReadyEvent`) as part of checking out, - * this duration is usually - * not greater than `ConnectionPoolOptions.waitQueueTimeoutMS`, - * but MAY occasionally be greater than that, - * because a driver does not provide hard real-time guarantees. - * - * A driver MAY choose the type idiomatic to the driver. - * If the type chosen does not convey units, e.g., `int64`, - * then the driver MAY include units in the name, e.g., `durationMS`. 
- */ - duration: Duration; - } - - /** - * Emitted when the driver checks in a Connection back to the Connection Pool - */ - interface ConnectionCheckedInEvent { - /** - * The ServerAddress of the Endpoint the pool is attempting to connect to. - */ - address: string; - - /** - * The ID of the Connection - */ - connectionId: int64; - } - -Connection Pool Logging -~~~~~~~~~~~~~~~~~~~~~~~ -Please refer to the `logging specification <../logging/logging.rst>`__ for details on logging implementations in general, including log levels, log -components, handling of null values in log messages, and structured versus unstructured logging. - -Drivers MUST support logging of connection pool information via the following types of log messages. These messages MUST be logged at ``Debug`` level -and use the ``connection`` log component. These messages MUST be emitted when specified in “Connection Pool Behaviors”. - -The log messages are intended to match the information contained in the events above. Drivers MAY implement connection logging support via an event -subscriber if it is convenient to do so. - -The types used in the structured message definitions below are demonstrative, and drivers MAY use similar types instead so long as the information -is present (e.g. a double instead of an integer, or a string instead of an integer if the structured logging framework does not support numeric types). - -Common Fields -------------- -All connection log messages MUST contain the following key-value pairs: - -.. list-table:: - :header-rows: 1 - :widths: 1 1 1 - - * - Key - - Suggested Type - - Value - - * - serverHost - - String - - the hostname, IP address, or Unix domain socket path for the endpoint the pool is for. - - * - serverPort - - Int - - The port for the endpoint the pool is for. Optional; not present for Unix domain sockets. When - the user does not specify a port and the default (27017) is used, the driver SHOULD include it here. 
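The rules for the two common fields can be sketched as a small helper. This is a minimal illustration under stated assumptions: ``common_log_fields`` is a hypothetical function, and treating any host that ends in ``.sock`` as a Unix domain socket path is a simplification made only for this example.

```python
from typing import Any, Dict, Optional

DEFAULT_PORT = 27017


def common_log_fields(host: str, port: Optional[int] = None) -> Dict[str, Any]:
    """Build the key-value pairs shared by every connection log message."""
    fields: Dict[str, Any] = {"serverHost": host}
    # Assumption for this sketch: a host ending in ".sock" is a Unix domain
    # socket path, so serverPort is left out.
    if host.endswith(".sock"):
        return fields
    # When the user gave no port, report the default that is actually used.
    fields["serverPort"] = port if port is not None else DEFAULT_PORT
    return fields
```

A message-specific builder could then merge these pairs with its own keys, such as ``message`` or ``driverConnectionId``.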
- -Pool Created Message ---------------------- -In addition to the common fields defined above, this message MUST contain the following key-value pairs: - -.. list-table:: - :header-rows: 1 - :widths: 1 1 1 - - * - Key - - Suggested Type - - Value - - * - message - - String - - "Connection pool created" - - * - maxIdleTimeMS - - Int - - The maxIdleTimeMS value for this pool. Optional; only required to include if the user specified a value. - - * - minPoolSize - - Int - - The minPoolSize value for this pool. Optional; only required to include if the user specified a value. - - * - maxPoolSize - - Int - - The maxPoolSize value for this pool. Optional; only required to include if the user specified a value. - - * - maxConnecting - - Int - - The maxConnecting value for this pool. Optional; only required to include if the driver supports this option and the user - specified a value. - - * - waitQueueTimeoutMS - - Int - - The waitQueueTimeoutMS value for this pool. Optional; only required to include if the driver supports this option and the - user specified a value. - - * - waitQueueSize - - Int - - The waitQueueSize value for this pool. Optional; only required to include if the driver supports this option and the - user specified a value. - - * - waitQueueMultiple - - Int - - The waitQueueMultiple value for this pool. Optional; only required to include if the driver supports this option and the - user specified a value. 
- -The unstructured form SHOULD be as follows, using the values defined in the structured format above to fill in placeholders as appropriate: - - Connection pool created for {{serverHost}}:{{serverPort}} using options maxIdleTimeMS={{maxIdleTimeMS}}, - minPoolSize={{minPoolSize}}, maxPoolSize={{maxPoolSize}}, maxConnecting={{maxConnecting}}, waitQueueTimeoutMS={{waitQueueTimeoutMS}}, - waitQueueSize={{waitQueueSize}}, waitQueueMultiple={{waitQueueMultiple}} - -Pool Ready Message ------------------- -In addition to the common fields defined above, this message MUST contain the following key-value pairs: - -.. list-table:: - :header-rows: 1 - :widths: 1 1 1 - - * - Key - - Suggested Type - - Value - - * - message - - String - - "Connection pool ready" - -The unstructured form SHOULD be as follows, using the values defined in the structured format above to fill in placeholders as appropriate: - - Connection pool ready for {{serverHost}}:{{serverPort}} - -Pool Cleared Message --------------------- -In addition to the common fields defined above, this message MUST contain the following key-value pairs: - -.. list-table:: - :header-rows: 1 - :widths: 1 1 1 - - * - Key - - Suggested Type - - Value - - * - message - - String - - "Connection pool cleared" - - * - serviceId - - String - - The hex string representation of the service ID which the pool was cleared for. Optional; only present in load balanced mode. - -The unstructured form SHOULD be as follows, using the values defined in the structured format above to fill in placeholders as appropriate: - - Connection pool for {{serverHost}}:{{serverPort}} cleared for serviceId {{serviceId}} - -Pool Closed Message -------------------- -In addition to the common fields defined above, this message MUST contain the following key-value pairs: - -.. 
list-table:: - :header-rows: 1 - :widths: 1 1 1 - - * - Key - - Suggested Type - - Value - - * - message - - String - - "Connection pool closed" - -The unstructured form SHOULD be as follows, using the values defined in the structured format above to fill in placeholders as appropriate: - - Connection pool closed for {{serverHost}}:{{serverPort}} - -Connection Created Message --------------------------- -In addition to the common fields defined above, this message MUST contain the following key-value pairs: - -.. list-table:: - :header-rows: 1 - :widths: 1 1 1 - - * - Key - - Suggested Type - - Value - - * - message - - String - - "Connection created" - - * - driverConnectionId - - Int64 - - The driver-generated ID for the connection as defined in `Connection <#connection>`_. - -The unstructured form SHOULD be as follows, using the values defined in the structured format above to fill in placeholders as appropriate: - - Connection created: address={{serverHost}}:{{serverPort}}, driver-generated ID={{driverConnectionId}} - -Connection Ready Message ------------------------- -In addition to the common fields defined above, this message MUST contain the following key-value pairs: - -.. list-table:: - :header-rows: 1 - :widths: 1 1 1 - - * - Key - - Suggested Type - - Value - - * - message - - String - - "Connection ready" - - * - driverConnectionId - - Int64 - - The driver-generated ID for the connection as defined in `Connection <#connection>`_. - - * - durationMS - - Int64 - - ``ConnectionReadyEvent.duration`` converted to milliseconds. 
- -The unstructured form SHOULD be as follows, using the values defined in the structured format above to fill in placeholders as appropriate: - - Connection ready: address={{serverHost}}:{{serverPort}}, driver-generated ID={{driverConnectionId}}, established in={{durationMS}} ms - -Connection Closed Message -------------------------- -In addition to the common fields defined above, this message MUST contain the following key-value pairs: - -.. list-table:: - :header-rows: 1 - :widths: 1 1 1 - - * - Key - - Suggested Type - - Value - - * - message - - String - - "Connection closed" - - * - driverConnectionId - - Int64 - - The driver-generated ID for the connection as defined in `Connection <#connection>`_. - - * - reason - - String - - A string describing the reason the connection was closed. The following strings MUST be used for each possible reason - as defined in `Events <#events>`_ above: - - - Stale: "Connection became stale because the pool was cleared" - - Idle: "Connection has been available but unused for longer than the configured max idle time" - - Error: "An error occurred while using the connection" - - Pool closed: "Connection pool was closed" - - * - error - - Flexible - - If ``reason`` is ``Error``, the associated error. The type and format of this value is flexible; see the - `logging specification <../logging/logging.rst#representing-errors-in-log-messages>`__ for details on representing errors in log messages. - -The unstructured form SHOULD be as follows, using the values defined in the structured format above to fill in placeholders as appropriate: - - Connection closed: address={{serverHost}}:{{serverPort}}, driver-generated ID={{driverConnectionId}}. Reason: {{reason}}. Error: {{error}} - -Connection Checkout Started Message ------------------------------------ -In addition to the common fields defined above, this message MUST contain the following key-value pairs: - -.. 
list-table:: - :header-rows: 1 - :widths: 1 1 1 - - * - Key - - Suggested Type - - Value - - * - message - - String - - "Connection checkout started" - -The unstructured form SHOULD be as follows, using the values defined in the structured format above to fill in placeholders as appropriate: - - Checkout started for connection to {{serverHost}}:{{serverPort}} - -Connection Checkout Failed Message ------------------------------------ -In addition to the common fields defined above, this message MUST contain the following key-value pairs: - -.. list-table:: - :header-rows: 1 - :widths: 1 1 1 - - * - Key - - Suggested Type - - Value - - * - message - - String - - "Connection checkout failed" - - * - reason - - String - - A string describing the reason checkout failed. The following strings MUST be used for each possible reason - as defined in `Events <#events>`_ above: - - - Timeout: "Wait queue timeout elapsed without a connection becoming available" - - ConnectionError: "An error occurred while trying to establish a new connection" - - Pool closed: "Connection pool was closed" - - * - error - - Flexible - - If ``reason`` is ``ConnectionError``, the associated error. The type and format of this value is flexible; see the - `logging specification <../logging/logging.rst#representing-errors-in-log-messages>`__ for details on representing errors in log messages. - - * - durationMS - - Int64 - - ``ConnectionCheckOutFailedEvent.duration`` converted to milliseconds. - -The unstructured form SHOULD be as follows, using the values defined in the structured format above to fill in placeholders as appropriate: - - Checkout failed for connection to {{serverHost}}:{{serverPort}}. Reason: {{reason}}. Error: {{error}}. Duration: {{durationMS}} ms - -Connection Checked Out ------------------------ -In addition to the common fields defined above, this message MUST contain the following key-value pairs: - -..
list-table:: - :header-rows: 1 - :widths: 1 1 1 - - * - Key - - Suggested Type - - Value - - * - message - - String - - "Connection checked out" - - * - driverConnectionId - - Int64 - - The driver-generated ID for the connection as defined in `Connection <#connection>`_. - - * - durationMS - - Int64 - - ``ConnectionCheckedOutEvent.duration`` converted to milliseconds. - -The unstructured form SHOULD be as follows, using the values defined in the structured format above to fill in placeholders as appropriate: - - Connection checked out: address={{serverHost}}:{{serverPort}}, driver-generated ID={{driverConnectionId}}, duration={{durationMS}} ms - -Connection Checked In ---------------------- -In addition to the common fields defined above, this message MUST contain the following key-value pairs: - -.. list-table:: - :header-rows: 1 - :widths: 1 1 1 - - * - Key - - Suggested Type - - Value - - * - message - - String - - "Connection checked in" - - * - driverConnectionId - - Int64 - - The driver-generated ID for the connection as defined in `Connection <#connection>`_. - -The unstructured form SHOULD be as follows, using the values defined in the structured format above to fill in placeholders as appropriate: - - Connection checked in: address={{serverHost}}:{{serverPort}}, driver-generated ID={{driverConnectionId}} - -Connection Pool Errors -~~~~~~~~~~~~~~~~~~~~~~ - -A connection pool throws errors in specific circumstances. These Errors -MUST be emitted by the pool. Errors SHOULD be created and dispatched in -a manner idiomatic to the Driver and Language. - -..
code:: typescript - - /** - * Thrown when the driver attempts to check out a - * Connection from a closed Connection Pool - */ - interface PoolClosedError { - message: 'Attempted to check out a Connection from closed connection pool'; - address: ; - } - - /** - * Thrown when the driver attempts to check out a - * Connection from a paused Connection Pool - */ - interface PoolClearedError extends RetryableError { - message: 'Connection pool for was cleared because another operation failed with: '; - address: ; - } - - /** - * Thrown when a driver times out when attempting to check out - * a Connection from a Pool - */ - interface WaitQueueTimeoutError { - message: 'Timed out while checking out a Connection from connection pool'; - address: ; - } - -Test Plan -========= - -See `tests/README.rst `_ - -Design Rationale -================ - -Why do we set minPoolSize across all members of a replicaSet, when most traffic will be against a Primary? -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Currently, we are attempting to codify our current pooling behavior with minimal changes, and minPoolSize is currently uniform across all members of a replicaSet. This has the benefit of offsetting connection swarming during a Primary Step-Down, which will be further addressed in our `Advanced Pooling Behaviors <#advanced-pooling-behaviors>`__. - -Why do we have separate ConnectionCreated and ConnectionReady events, but only one ConnectionClosed event? -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -ConnectionCreated and ConnectionReady each involve different state changes in the pool. 
 - -- ConnectionCreated adds a new “pending” `Connection <#connection>`_, meaning - the totalConnectionCount and pendingConnectionCount increase by one -- ConnectionReady establishes that the `Connection <#connection>`_ is ready for use, meaning the availableConnectionCount increases by one - -ConnectionClosed indicates that the `Connection <#connection>`_ is no longer a member of the pool, decrementing totalConnectionCount and potentially availableConnectionCount. After this point, the `Connection <#connection>`_ is no longer a part of the pool. Further hypothetical events would not indicate a change to the state of the pool, so they are not specified here. - -Why are waitQueueSize and waitQueueMultiple deprecated? -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -These options were originally only implemented in three drivers (Java, C#, and Python), and provided little value. While these fields would allow for faster diagnosis of issues in the connection pool, they would not actually prevent an error from occurring. - -Additionally, these options have the effect of prioritizing older requests over newer requests, which is not necessarily the behavior that users want. They can also result in cases where queue access oscillates back and forth between full and not full. If a driver has a full waitQueue, then all requests for `Connections <#connection>`_ will be rejected. If the client is continually spammed with requests, you could wind up with a scenario where as soon as the waitQueue is no longer full, it is immediately filled. It is not a favorable situation to be in, partly because it violates the fairness guarantee that the waitQueue normally provides. - -Because of these issues, it does not make sense to `go against driver mantras and provide an additional knob <../../README.rst#>`__. We may eventually pursue alternative configurations to address wait queue size in `Advanced Pooling Behaviors <#advanced-pooling-behaviors>`__.
- -Users that wish to have this functionality can achieve similar results by utilizing other methods to limit concurrency. Examples include implementing either a thread pool or an operation queue with a capped size in the user application. Drivers that need to deprecate ``waitQueueSize`` and/or ``waitQueueMultiple`` SHOULD refer users to these examples. - -Why is waitQueueTimeoutMS optional for some drivers? -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -We are anticipating eventually introducing a single client-side timeout mechanism, making us hesitant to introduce another granular timeout control. Therefore, if a driver/language already has an idiomatic way to implement their timeouts, they should leverage that mechanism over implementing waitQueueTimeoutMS. - -Why must populating the pool require the use of a background thread or async I/O? -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Without the use of a background thread, the pool is `populated -<#populating-the-pool-with-a-connection-internal-implementation>`_ with enough -connections to satisfy minPoolSize during checkOut. `Connections <#connection>`_ -are established as part of populating the pool though, so if `Connection -<#connection>`_ establishment were done in a blocking fashion, the first -operations after a clearing of the pool would experience unacceptably high -latency, especially for larger values of minPoolSize. Thus, populating the pool -must occur on a background thread (which is acceptable to block) or via the -usage of non-blocking (async) I/O. - -Why should closing a connection be non-blocking? -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Because idle and perished `Connections <#connection>`_ are cleaned up as part of -checkOut, performing blocking I/O while closing such `Connections <#connection>`_ -would block application threads, introducing unnecessary latency. 
Once -a `Connection <#connection>`_ is marked as "closed", it will not be checked out -again, so ensuring the socket is torn down does not need to happen -immediately and can happen at a later time, either via async I/O or a -background thread. - -Why can the pool be paused? -~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The distinction between the "paused" state and the "ready" state allows the pool -to determine whether or not the endpoint it is associated with is available or -not. This enables the following behaviors: - -1. The pool can halt the creation of background connection establishments until - the endpoint becomes available again. Without the "paused" state, the pool - would have no way of determining when to begin establishing background - connections again, so it would just continually attempt, and often fail, to - create connections until minPoolSize was satisfied, even after repeated - failures. This could unnecessarily waste resources both server and driver side. - -2. The pool can evict requests that enter the WaitQueue after the pool was - cleared but before the server was in a known state again. Such requests can - occur when a server is selected at the same time as it becomes marked as - Unknown in highly concurrent workloads. Without the "paused" state, the pool - would attempt to service these requests, since it would assume they were - routed to the pool because its endpoint was available, not because of a race - between SDAM and Server Selection. These requests would then likely fail with - potentially high latency, again wasting resources both server and driver side. - -Why not emit PoolCleared events and log messages when clearing a paused pool? -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -If a pool is already paused when it is cleared, that means it was previously -cleared and no new connections have been created since then. 
Thus, clearing the -pool in this case is essentially a no-op, so there is no need to notify any -listeners that it has occurred. The generation is still incremented, however, to -ensure future errors that caused the duplicate clear will stop attempting to -clear the pool again. This situation is possible if the pool is cleared by the -background thread after it encounters an error establishing a connection, but -the ServerDescription for the endpoint was not updated accordingly yet. - -Why does the pool need to support interrupting in use connections as part of its clear logic? -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -If a SDAM monitor has observed a network timeout, we assume that all connections -including "in use" connections are no longer healthy. In some cases connections -will fail to detect the network timeout fast enough. For example, a server request -can hang at the OS level in TCP retry loop up for 17 minutes before failing. Therefore -these connections MUST be proactively interrupted in the case of a server monitor network timeout. -Requesting an immediate background thread run will speed up this process. - -Why don't we configure TCP_USER_TIMEOUT? -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Ideally, a reasonable TCP_USER_TIMEOUT can help with detecting stale connections as an -alternative to `interruptInUseConnections` in Clear. -Unfortunately this approach is platform dependent and not each driver allows easily configuring it. -For example, C# driver can configure this socket option on linux only with target frameworks -higher or equal to .net 5.0. On macOS, there is no straight equivalent for this option, -it's possible that we can find some equivalent configuration, but this configuration will also -require target frameworks higher than or equal to .net 5.0. The advantage of using Background Thread to -manage perished connections is that it will work regardless of environment setup. 
- -Backwards Compatibility -======================= - -As mentioned in `Deprecated Options <#deprecated-options>`__, some drivers currently implement the options ``waitQueueSize`` and/or ``waitQueueMultiple``. These options will need to be deprecated and phased out of the drivers that have implemented them. - - -Reference Implementations -========================= - -- JAVA (JAVA-3079) -- RUBY (RUBY-1560) - -Future Development -================== - -SDAM -~~~~ - -This specification does not dictate how SDAM Monitoring connections are managed. SDAM specifies that “A monitor SHOULD NOT use the client's regular Connection pool”. Some possible solutions for this include: - -- Having each Endpoint representation in the driver create and manage a separate dedicated `Connection <#connection>`_ for monitoring purposes -- Having each Endpoint representation in the driver maintain a separate pool of maxPoolSize 1 for monitoring purposes. -- Having each Pool maintain a dedicated `Connection <#connection>`_ for monitoring purposes, with an API to expose that Connection. - -Advanced Pooling Behaviors -~~~~~~~~~~~~~~~~~~~~~~~~~~ - -This spec does not address all advanced pooling behaviors like predictive pooling or aggressive `Connection <#connection>`_ creation. Future work may address this. - -Add support for OP_MSG exhaustAllowed -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Exhaust Cursors may require changes to how we close `Connections <#connection>`_ in the future, specifically to add a way to close and remove from its pool a `Connection <#connection>`_ which has unread exhaust messages. - - -Changelog -========= - -:2019-06-06: Add "connectionError" as a valid reason for ConnectionCheckOutFailedEvent -:2020-09-03: Clarify Connection states and definition. Require the use of a - background thread and/or async I/O. Add tests to ensure - ConnectionReadyEvents are fired after ConnectionCreatedEvents. 
-:2020-09-24: Introduce maxConnecting requirement -:2020-12-17: Introduce "paused" and "ready" states. Clear WaitQueue on pool clear. -:2021-01-12: Clarify "clear" method behavior in load balancer mode. -:2021-01-19: Require that timeouts be applied per the client-side operations - timeout specification. -:2021-04-12: Adding in behaviour for load balancer mode. -:2021-06-02: Formalize the behavior of a `Background Thread <#background-thread>`__. -:2021-11-08: Make maxConnecting configurable. -:2022-04-05: Preemptively cancel in progress operations when SDAM heartbeats timeout. -:2022-10-05: Remove spec front matter and reformat changelog. -:2022-10-14: Add connection pool log messages and associated tests. -:2023-04-17: Fix duplicate logging test description. -:2023-08-04: Add durations to connection pool events. -:2023-10-04: Commit to the currently specified requirements regarding durations in events. - ----- - -.. Section for links. - -.. _Application Errors: /source/server-discovery-and-monitoring/server-discovery-and-monitoring.rst#application-errors -.. 
_Connection Pool Management: /source/server-discovery-and-monitoring/server-discovery-and-monitoring.rst#connection-pool-management diff --git a/source/connection-monitoring-and-pooling/tests/README.md b/source/connection-monitoring-and-pooling/tests/README.md new file mode 100644 index 0000000000..5d19b8ef13 --- /dev/null +++ b/source/connection-monitoring-and-pooling/tests/README.md @@ -0,0 +1,27 @@ +# Connection Monitoring and Pooling (CMAP) + +______________________________________________________________________ + +## Introduction + +Drivers MUST implement all of the following types of CMAP tests: + +- Pool unit and integration tests as described in [cmap-format/README](./cmap-format/README.md) +- Pool prose tests as described below in [Prose Tests](#prose-tests) +- Logging tests as described below in [Logging Tests](#logging-tests) + +## Prose Tests + +The following tests have not yet been automated, but MUST still be tested: + +1. All ConnectionPoolOptions MUST be specified at the MongoClient level +1. All ConnectionPoolOptions MUST be the same for all pools created by a MongoClient +1. A user MUST be able to specify all ConnectionPoolOptions via a URI string +1. A user MUST be able to subscribe to Connection Monitoring Events in a manner idiomatic to their language and driver +1. When a check out attempt fails because connection set up throws an error, assert that a ConnectionCheckOutFailedEvent + with reason="connectionError" is emitted. + +## Logging Tests + +Tests for connection pool logging can be found in the [/logging](./logging) subdirectory and are written in the +[Unified Test Format](../../unified-test-format/unified-test-format.rst). diff --git a/source/connection-monitoring-and-pooling/tests/README.rst b/source/connection-monitoring-and-pooling/tests/README.rst deleted file mode 100644 index ab7d8fac5c..0000000000 --- a/source/connection-monitoring-and-pooling/tests/README.rst +++ /dev/null @@ -1,36 +0,0 @@ -.. 
role:: javascript(code) - :language: javascript - -======================================== -Connection Monitoring and Pooling (CMAP) -======================================== - -.. contents:: - --------- - -Introduction -============ -Drivers MUST implement all of the following types of CMAP tests: - -* Pool unit and integration tests as described in `cmap-format/README.rst <./cmap-format/README.rst>`__ -* Pool prose tests as described below in `Prose Tests`_ -* Logging tests as described below in `Logging Tests`_ - -Prose Tests -=========== - -The following tests have not yet been automated, but MUST still be tested: - -#. All ConnectionPoolOptions MUST be specified at the MongoClient level -#. All ConnectionPoolOptions MUST be the same for all pools created by a MongoClient -#. A user MUST be able to specify all ConnectionPoolOptions via a URI string -#. A user MUST be able to subscribe to Connection Monitoring Events in a manner idiomatic to their language and driver -#. When a check out attempt fails because connection set up throws an error, - assert that a ConnectionCheckOutFailedEvent with reason="connectionError" is emitted. - -Logging Tests -============= - -Tests for connection pool logging can be found in the `/logging <./logging>`__ subdirectory and are written in the -`Unified Test Format <../../unified-test-format/unified-test-format.rst>`__. 
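Note on prose test 5 above (a check out that fails because connection set up throws): the following is a minimal sketch of how a driver test might assert the `ConnectionCheckOutFailedEvent` with `reason="connectionError"`. It assumes a hypothetical in-memory `MockPool` whose constructor takes an address and a connection factory. These names, the `events` array, and the other event type strings are illustrative and are not defined by the specification.

```javascript
// Illustrative only: MockPool, its constructor arguments, and the `events`
// array are hypothetical names, not part of the CMAP specification.
class MockPool {
  constructor(address, establishConnection) {
    this.address = address;
    this.establishConnection = establishConnection; // may throw during set up
    this.events = []; // captured monitoring events, in emission order
  }

  emit(type, fields = {}) {
    this.events.push({ type, address: this.address, ...fields });
  }

  checkOut() {
    this.emit("ConnectionCheckOutStartedEvent");
    let connection;
    try {
      connection = this.establishConnection();
    } catch (err) {
      // Connection set up failed: report it before rethrowing to the caller.
      this.emit("ConnectionCheckOutFailedEvent", { reason: "connectionError" });
      throw err;
    }
    this.emit("ConnectionCheckedOutEvent", { connectionId: connection.id });
    return connection;
  }
}

// Prose test 5: connection set up throws, so the check out must fail and
// report the "connectionError" reason.
function runConnectionErrorProseTest() {
  const pool = new MockPool("localhost:27017", () => {
    throw new Error("simulated set up failure");
  });
  let thrown = null;
  try {
    pool.checkOut();
  } catch (err) {
    thrown = err;
  }
  const failed = pool.events.filter(
    (e) => e.type === "ConnectionCheckOutFailedEvent"
  );
  return { thrown, failed };
}
```

A real driver would run the same assertion against its own pool and event subscription API; the mock only shows the ordering, with the failure event emitted before the error reaches the caller.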
diff --git a/source/connection-monitoring-and-pooling/tests/cmap-format/README.md b/source/connection-monitoring-and-pooling/tests/cmap-format/README.md new file mode 100644 index 0000000000..50d12f3458 --- /dev/null +++ b/source/connection-monitoring-and-pooling/tests/cmap-format/README.md @@ -0,0 +1,166 @@ +# Connection Monitoring and Pooling (CMAP) Unit and Integration Tests + +______________________________________________________________________ + +## Introduction + +The YAML and JSON files in this directory are platform-independent tests that drivers can use to prove their conformance +to the Connection Monitoring and Pooling (CMAP) Spec. + +## Common Test Format + +Each YAML file has the following keys: + +- `version`: A version number indicating the expected format of the spec tests (current version = 1) +- `style`: A string indicating what style of tests this file contains. Contains one of the following: + - `"unit"`: a test that may be run without connecting to a MongoDB deployment. + - `"integration"`: a test that MUST be run against a real MongoDB deployment. +- `description`: A text description of what the test is meant to assert + +## Unit Test Format: + +All Unit Tests have some of the following fields: + +- `poolOptions`: If present, connection pool options to use when creating a pool; both + [standard ConnectionPoolOptions](../../connection-monitoring-and-pooling.md#connection-pool-options-1) and the + following test-specific options are allowed: + - `backgroundThreadIntervalMS`: A time interval between the end of a + [Background Thread Run](../../connection-monitoring-and-pooling.md#background-thread) and the beginning of the next + Run. If a Connection Pool does not implement a Background Thread, the Test Runner MUST ignore the option. If the + option is not specified, an implementation is free to use any value it finds reasonable. + + Possible values (0 is not allowed): + + - A negative value: never begin a Run. 
+ - A positive value: the interval between Runs in milliseconds. +- `operations`: A list of operations to perform. All operations support the following fields: + - `name`: A string describing which operation to issue. + - `thread`: The name of the thread in which to run this operation. If not specified, runs in the default thread +- `error`: Indicates that the main thread is expected to error during this test. An error may include any of the following + fields: + - `type`: the type of error emitted + - `message`: the message associated with that error + - `address`: Address of pool emitting error +- `events`: An array of all connection monitoring events expected to occur while running `operations`. An event may + contain any of the following fields: + - `type`: The type of event emitted + - `address`: The address of the pool emitting the event + - `connectionId`: The id of a connection associated with the event + - `options`: Options used to create the pool + - `reason`: A reason giving more information on why the event was emitted +- `ignore`: An array of event names to ignore + +Valid Unit Test Operations are the following: + +- `start(target)`: Starts a new thread named `target` + - `target`: The name of the new thread to start +- `wait(ms)`: Sleep the current thread for `ms` milliseconds + - `ms`: The number of milliseconds to sleep the current thread for +- `waitForThread(target)`: wait for thread `target` to finish executing. Propagate any errors to the main thread. + - `target`: The name of the thread to wait for.
+- `waitForEvent(event, count, timeout)`: block the current thread until `event` has occurred `count` times + - `event`: The name of the event + - `count`: The number of times the event must occur (counting from the start of the test) + - `timeout`: If specified, time out with an error after waiting for this many milliseconds without seeing the required + events +- `label = pool.checkOut()`: call `checkOut` on pool, returning the checked out connection + - `label`: If specified, associate this label with the returned connection, so that it may be referenced in later + operations +- `pool.checkIn(connection)`: call `checkIn` on pool + - `connection`: A string label identifying which connection to check in. Should be a label that was previously set + with `checkOut` +- `pool.clear()`: call `clear` on Pool + - `interruptInUseConnections`: Determines whether "in use" connections should be also interrupted +- `pool.close()`: call `close` on Pool +- `pool.ready()`: call `ready` on Pool + +## Integration Test Format + +The integration test format is identical to the unit test format with the addition of the following fields to each test: + +- `runOn` (optional): An array of server version and/or topology requirements for which the tests can be run. If the + test environment satisfies one or more of these requirements, the tests may be executed; otherwise, this test should + be skipped. If this field is omitted, the tests can be assumed to have no particular requirements and should be + executed. Each element will have some or all of the following fields: + - `minServerVersion` (optional): The minimum server version (inclusive) required to successfully run the tests. If + this field is omitted, it should be assumed that there is no lower bound on the required server version. + - `maxServerVersion` (optional): The maximum server version (inclusive) against which the tests can be run + successfully. 
If this field is omitted, it should be assumed that there is no upper bound on the required server + version. +- `failPoint`: optional, a document containing a `configureFailPoint` command to run against the endpoint being used for + the test. +- `poolOptions.appName` (optional): appName attribute to be set in connections, which will be affected by the fail + point. + +## Spec Test Match Function + +The definition of MATCH or MATCHES in the Spec Test Runner is as follows: + +- MATCH takes two values, `expected` and `actual` +- Notation is "Assert `actual` MATCHES `expected`" +- Assertion passes if `expected` is a subset of `actual`, with the values `42` and `"42"` acting as placeholders for + "any value" + +Pseudocode implementation of `actual` MATCHES `expected`: + +``` +If expected is "42" or 42: + Assert that actual exists (is not null or undefined) +Else: + Assert that actual is of the same JSON type as expected + If expected is a JSON array: + For every idx/value in expected: + Assert that actual[idx] MATCHES value + Else if expected is a JSON object: + For every key/value in expected: + Assert that actual[key] MATCHES value + Else: + Assert that expected equals actual +``` + +## Unit Test Runner: + +For the unit tests, the behavior of a Connection is irrelevant beyond the need to assert `connection.id`. Drivers MAY +use a mock connection class for testing the pool behavior in unit tests. + +For each YAML file with `style: unit`: + +- Create a Pool `pool`, subscribe and capture any Connection Monitoring events emitted in order. + - If `poolOptions` is specified, use those options to initialize the pool + - The returned pool must have an `address` set as a string value. +- Process each `operation` in `operations` (on the main thread) + - If a `thread` is specified, the main thread MUST schedule the operation to execute in the corresponding thread. + Otherwise, execute the operation directly in the main thread.
+- If `error` is presented + - Assert that an actual error `actualError` was thrown by the main thread + - Assert that `actualError` MATCHES `error` +- Else: + - Assert that no errors were thrown by the main thread +- calculate `actualEvents` as every Connection Event emitted whose `type` is not in `ignore` +- if `events` is not empty, then for every `idx`/`expectedEvent` in `events` + - Assert that `actualEvents[idx]` exists + - Assert that `actualEvents[idx]` MATCHES `expectedEvent` + +It is important to note that the `ignore` list is used for calculating `actualEvents`, but is NOT used for the +`waitForEvent` command + +## Integration Test Runner + +The steps to run the integration tests are the same as those used to run the unit tests with the following +modifications: + +- The integration tests MUST be run against an actual endpoint. If the deployment being tested contains multiple + endpoints, then the runner MUST only use one of them to run the tests against. + +- For each test, if `failPoint` is specified, its value is a `configureFailPoint` command. Run the command on the admin + database of the endpoint being tested to enable the fail point. + +- At the end of each test, any enabled fail point MUST be disabled to avoid spurious failures in subsequent tests. The + fail point may be disabled like so: + + ```javascript + db.adminCommand({ + configureFailPoint: "", + mode: "off" + }); + ``` diff --git a/source/connection-monitoring-and-pooling/tests/cmap-format/README.rst b/source/connection-monitoring-and-pooling/tests/cmap-format/README.rst deleted file mode 100644 index 52c16fef5d..0000000000 --- a/source/connection-monitoring-and-pooling/tests/cmap-format/README.rst +++ /dev/null @@ -1,215 +0,0 @@ -.. 
role:: javascript(code) - :language: javascript - -=================================================================== -Connection Monitoring and Pooling (CMAP) Unit and Integration Tests -=================================================================== - -.. contents:: - --------- - -Introduction -============ - -The YAML and JSON files in this directory are platform-independent tests that -drivers can use to prove their conformance to the Connection Monitoring and Pooling (CMAP) Spec. - -Common Test Format -================== - -Each YAML file has the following keys: - -- ``version``: A version number indicating the expected format of the spec tests (current version = 1) -- ``style``: A string indicating what style of tests this file contains. Contains one of the following: - - - ``"unit"``: a test that may be run without connecting to a MongoDB deployment. - - ``"integration"``: a test that MUST be run against a real MongoDB deployment. - -- ``description``: A text description of what the test is meant to assert - -Unit Test Format: -================= - -All Unit Tests have some of the following fields: - -- ``poolOptions``: If present, connection pool options to use when creating a pool; - both `standard ConnectionPoolOptions `__ - and the following test-specific options are allowed: - - - ``backgroundThreadIntervalMS``: A time interval between the end of a - `Background Thread Run `__ - and the beginning of the next Run. If a Connection Pool does not implement a Background Thread, the Test Runner MUST ignore the option. - If the option is not specified, an implementation is free to use any value it finds reasonable. - - Possible values (0 is not allowed): - - - A negative value: never begin a Run. - - A positive value: the interval between Runs in milliseconds. - -- ``operations``: A list of operations to perform. All operations support the following fields: - - - ``name``: A string describing which operation to issue. 
- - ``thread``: The name of the thread in which to run this operation. If not specified, runs in the default thread - -- ``error``: Indicates that the main thread is expected to error during this test. An error may include of the following fields: - - - ``type``: the type of error emitted - - ``message``: the message associated with that error - - ``address``: Address of pool emitting error - -- ``events``: An array of all connection monitoring events expected to occur while running ``operations``. An event may contain any of the following fields - - - ``type``: The type of event emitted - - ``address``: The address of the pool emitting the event - - ``connectionId``: The id of a connection associated with the event - - ``options``: Options used to create the pool - - ``reason``: A reason giving more information on why the event was emitted - -- ``ignore``: An array of event names to ignore - -Valid Unit Test Operations are the following: - -- ``start(target)``: Starts a new thread named ``target`` - - - ``target``: The name of the new thread to start - -- ``wait(ms)``: Sleep the current thread for ``ms`` milliseconds - - - ``ms``: The number of milliseconds to sleep the current thread for - -- ``waitForThread(target)``: wait for thread ``target`` to finish executing. Propagate any errors to the main thread. - - - ``target``: The name of the thread to wait for. 
- -- ``waitForEvent(event, count, timeout)``: block the current thread until ``event`` has occurred ``count`` times - - - ``event``: The name of the event - - ``count``: The number of times the event must occur (counting from the start of the test) - - ``timeout``: If specified, time out with an error after waiting for this many milliseconds without seeing the required events - -- ``label = pool.checkOut()``: call ``checkOut`` on pool, returning the checked out connection - - - ``label``: If specified, associate this label with the returned connection, so that it may be referenced in later operations - -- ``pool.checkIn(connection)``: call ``checkIn`` on pool - - - ``connection``: A string label identifying which connection to check in. Should be a label that was previously set with ``checkOut`` - -- ``pool.clear()``: call ``clear`` on Pool - - - ``interruptInUseConnections``: Determines whether "in use" connections should be also interrupted - -- ``pool.close()``: call ``close`` on Pool -- ``pool.ready()``: call ``ready`` on Pool - - -Integration Test Format -======================= - -The integration test format is identical to the unit test format with -the addition of the following fields to each test: - -- ``runOn`` (optional): An array of server version and/or topology requirements - for which the tests can be run. If the test environment satisfies one or more - of these requirements, the tests may be executed; otherwise, this test should - be skipped. If this field is omitted, the tests can be assumed to have no - particular requirements and should be executed. Each element will have some or - all of the following fields: - - - ``minServerVersion`` (optional): The minimum server version (inclusive) - required to successfully run the tests. If this field is omitted, it should - be assumed that there is no lower bound on the required server version. 
- - - ``maxServerVersion`` (optional): The maximum server version (inclusive) - against which the tests can be run successfully. If this field is omitted, - it should be assumed that there is no upper bound on the required server - version. - -- ``failPoint``: optional, a document containing a ``configureFailPoint`` - command to run against the endpoint being used for the test. - -- ``poolOptions.appName`` (optional): appName attribute to be set in connections, which will be affected by the fail point. - -Spec Test Match Function -======================== - -The definition of MATCH or MATCHES in the Spec Test Runner is as follows: - -- MATCH takes two values, ``expected`` and ``actual`` -- Notation is "Assert [actual] MATCHES [expected] -- Assertion passes if ``expected`` is a subset of ``actual``, with the values ``42`` and ``"42"`` acting as placeholders for "any value" - -Pseudocode implementation of ``actual`` MATCHES ``expected``: - -:: - - If expected is "42" or 42: - Assert that actual exists (is not null or undefined) - Else: - Assert that actual is of the same JSON type as expected - If expected is a JSON array: - For every idx/value in expected: - Assert that actual[idx] MATCHES value - Else if expected is a JSON object: - For every key/value in expected - Assert that actual[key] MATCHES value - Else: - Assert that expected equals actual - -Unit Test Runner: -================= - -For the unit tests, the behavior of a Connection is irrelevant beyond the need to asserting ``connection.id``. Drivers MAY use a mock connection class for testing the pool behavior in unit tests - -For each YAML file with ``style: unit``: - -- Create a Pool ``pool``, subscribe and capture any Connection Monitoring events emitted in order. - - - If ``poolOptions`` is specified, use those options to initialize both pools - - The returned pool must have an ``address`` set as a string value. 
- -- Process each ``operation`` in ``operations`` (on the main thread) - - - If a ``thread`` is specified, the main thread MUST schedule the operation to execute in the corresponding thread. Otherwise, execute the operation directly in the main thread. - -- If ``error`` is presented - - - Assert that an actual error ``actualError`` was thrown by the main thread - - Assert that ``actualError`` MATCHES ``error`` - -- Else: - - - Assert that no errors were thrown by the main thread - -- calculate ``actualEvents`` as every Connection Event emitted whose ``type`` is not in ``ignore`` -- if ``events`` is not empty, then for every ``idx``/``expectedEvent`` in ``events`` - - - Assert that ``actualEvents[idx]`` exists - - Assert that ``actualEvents[idx]`` MATCHES ``expectedEvent`` - - -It is important to note that the ``ignore`` list is used for calculating ``actualEvents``, but is NOT used for the ``waitForEvent`` command - -Integration Test Runner -======================= - -The steps to run the integration tests are the same as those used to run the -unit tests with the following modifications: - -- The integration tests MUST be run against an actual endpoint. If the - deployment being tested contains multiple endpoints, then the runner MUST - only use one of them to run the tests against. - -- For each test, if `failPoint` is specified, its value is a - ``configureFailPoint`` command. Run the command on the admin database of the - endpoint being tested to enable the fail point. - -- At the end of each test, any enabled fail point MUST be disabled to avoid - spurious failures in subsequent tests. 
The fail point may be disabled like
- so::
-
- db.adminCommand({
- configureFailPoint: ,
- mode: "off"
- });
diff --git a/source/connections-survive-step-down/tests/README.rst b/source/connections-survive-step-down/tests/README.rst
index 5128cd79f8..de52113939 100644
--- a/source/connections-survive-step-down/tests/README.rst
+++ b/source/connections-survive-step-down/tests/README.rst
@@ -172,7 +172,7 @@ methodology is in contrast to the one adopted by the SDAM spec tests that rely e
 server communication.
-.. _CMAP: /source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst
-.. _PoolClearedEvent: /source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst#events
+.. _CMAP: ../../connection-monitoring-and-pooling/connection-monitoring-and-pooling.md
+.. _PoolClearedEvent: /source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.md#events
 .. _serverStatus: https://www.mongodb.com/docs/manual/reference/command/serverStatus
 .. _connections.totalCreated: https://www.mongodb.com/docs/manual/reference/command/serverStatus/#serverstatus.connections.totalCreated
diff --git a/source/logging/logging.rst b/source/logging/logging.rst
index dd9f806104..da4635711e 100644
--- a/source/logging/logging.rst
+++ b/source/logging/logging.rst
@@ -176,7 +176,7 @@ driver-specific messages they produce.
 * - connection
 - `Connection Monitoring and Pooling
- <../connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst>`__
+ <../connection-monitoring-and-pooling/connection-monitoring-and-pooling.md>`__
 - ``MONGODB_LOG_CONNECTION``
diff --git a/source/retryable-reads/retryable-reads.rst b/source/retryable-reads/retryable-reads.rst
index 7238d87ba7..43eefd6966 100644
--- a/source/retryable-reads/retryable-reads.rst
+++ b/source/retryable-reads/retryable-reads.rst
@@ -71,7 +71,7 @@ SocketException 9001
 - a `PoolClearedError`_
- .. _PoolClearedError: ../connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst#connection-pool-errors
+ .. _PoolClearedError: ../connection-monitoring-and-pooling/connection-monitoring-and-pooling.md#connection-pool-errors
 - Any of the above retryable errors that occur during a connection handshake (including the authentication step). For example, a network error or ShutdownInProgress error
diff --git a/source/retryable-writes/retryable-writes.rst b/source/retryable-writes/retryable-writes.rst
index 4ba39c2218..6bdc2a6a65 100644
--- a/source/retryable-writes/retryable-writes.rst
+++ b/source/retryable-writes/retryable-writes.rst
@@ -238,7 +238,7 @@ The RetryableWriteError label might be added to an error in a variety of ways:
 the MongoClient performing the operation has the retryWrites configuration option set to true.
- .. _PoolClearedError: ../connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst#connection-pool-errors
+ .. _PoolClearedError: ../connection-monitoring-and-pooling/connection-monitoring-and-pooling.md#connection-pool-errors
 - For server versions 4.4 and newer, the server will add a RetryableWriteError label to errors or server responses that it considers retryable before
diff --git a/source/server-discovery-and-monitoring/server-discovery-and-monitoring-logging-and-monitoring.rst b/source/server-discovery-and-monitoring/server-discovery-and-monitoring-logging-and-monitoring.rst
index 1232e2c964..eba14b68a1 100644
--- a/source/server-discovery-and-monitoring/server-discovery-and-monitoring-logging-and-monitoring.rst
+++ b/source/server-discovery-and-monitoring/server-discovery-and-monitoring-logging-and-monitoring.rst
@@ -456,7 +456,7 @@ The following key-value pairs are common to all or several log messages and MUST
 - Heartbeat-related log messages
 - Int
 - The driver-generated ID for the monitoring connection as defined in the
- `connection monitoring and pooling specification <../connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst>`_. Unlike
+ `connection monitoring and pooling specification <../connection-monitoring-and-pooling/connection-monitoring-and-pooling.md>`_. Unlike
 ``connectionId`` in the above events, this field MUST NOT contain the host/port; that information MUST be in the above fields, ``serverHost`` and ``serverPort``. This field is optional for drivers that do not implement CMAP if they do have an equivalent concept of a connection ID.
diff --git a/source/server-discovery-and-monitoring/server-discovery-and-monitoring.rst b/source/server-discovery-and-monitoring/server-discovery-and-monitoring.rst
index 19f3769e25..e349fabeb2 100644
--- a/source/server-discovery-and-monitoring/server-discovery-and-monitoring.rst
+++ b/source/server-discovery-and-monitoring/server-discovery-and-monitoring.rst
@@ -2557,7 +2557,7 @@ Changelog
 .. _scanning order: server-monitoring.rst#scanning-order
 .. _clients update the topology from each handshake: server-monitoring.rst#clients-update-the-topology-from-each-handshake
 .. _single-threaded monitoring: server-monitoring.rst#single-threaded-monitoring
-.. _Connection Monitoring and Pooling spec: /source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst
-.. _CMAP spec: /source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst
+.. _Connection Monitoring and Pooling spec: ../connection-monitoring-and-pooling/connection-monitoring-and-pooling.md
+.. _CMAP spec: ../connection-monitoring-and-pooling/connection-monitoring-and-pooling.md
 .. _Authentication spec: /source/auth/auth.rst
 .. _Server Monitoring (Measuring RTT): server-monitoring.rst#measuring-rtt
diff --git a/source/server-discovery-and-monitoring/server-monitoring.rst b/source/server-discovery-and-monitoring/server-monitoring.rst
index d64b509a2c..6b7bde0faa 100644
--- a/source/server-discovery-and-monitoring/server-monitoring.rst
+++ b/source/server-discovery-and-monitoring/server-monitoring.rst
@@ -1251,10 +1251,10 @@ Changelog
 .. _SDAM Monitoring spec: server-discovery-and-monitoring-logging-and-monitoring.rst#heartbeats
 .. _OP_MSG Spec: /source/message/OP_MSG.rst
 .. _OP_MSG exhaustAllowed flag: /source/message/OP_MSG.rst#exhaustAllowed
-.. _Connection Pool: /source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst#Connection-Pool
+.. _Connection Pool: /source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.md#Connection-Pool
 .. _Why synchronize clearing a server's pool with updating the topology?: server-discovery-and-monitoring.rst#why-synchronize-clearing-a-server-s-pool-with-updating-the-topology?
 .. _Client Side Operations Timeout Spec: /source/client-side-operations-timeout/client-side-operations-timeout.rst
 .. _timeoutMS: /source/client-side-operations-timeout/client-side-operations-timeout.rst#timeoutMS
-.. _Why does the pool need to support closing in use connections as part of its clear logic?: /source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst#Why-does-the-pool-need-to-support-closing-in-use-connections-as-part-of-its-clear-logic?
+.. _Why does the pool need to support closing in use connections as part of its clear logic?: /source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.md#Why-does-the-pool-need-to-support-closing-in-use-connections-as-part-of-its-clear-logic?
 .. _DRIVERS-2246: https://jira.mongodb.org/browse/DRIVERS-2246
 .. _MongoDB Handshake spec: /source/mongodb-handshake/handshake.rst#client-env
diff --git a/source/server-selection/server-selection.rst b/source/server-selection/server-selection.rst
index 8ba03731ce..fb7d185ded 100644
--- a/source/server-selection/server-selection.rst
+++ b/source/server-selection/server-selection.rst
@@ -2023,8 +2023,8 @@ References
 .. _Max Staleness: https://github.com/mongodb/specifications/tree/master/source/max-staleness
 .. _idleWritePeriodMS: https://github.com/mongodb/specifications/blob/master/source/max-staleness/max-staleness.rst#idlewriteperiodms
 .. _Driver Authentication: https://github.com/mongodb/specifications/blob/master/source/auth
-.. _maxConnecting: /source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst#connection-pool
-.. _Connection Monitoring and Pooling: /source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst
+.. _maxConnecting: /source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.md#connection-pool
+.. _Connection Monitoring and Pooling: ../connection-monitoring-and-pooling/connection-monitoring-and-pooling.md
 .. _Global Command Argument: /source/message/OP_MSG.rst#global-command-arguments
 Changelog
diff --git a/source/unified-test-format/unified-test-format.rst b/source/unified-test-format/unified-test-format.rst
index 0e01d396ec..1a4af8c644 100644
--- a/source/unified-test-format/unified-test-format.rst
+++ b/source/unified-test-format/unified-test-format.rst
@@ -513,7 +513,7 @@ The structure of this object is as follows:
 connection pooling MUST track the number of connections checked out at any given time for the constructed MongoClient. This can be done using a single counter and `CMAP events
- <../connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst#events>`__.
+ <../connection-monitoring-and-pooling/connection-monitoring-and-pooling.md#events>`__.
 Each ``ConnectionCheckedOutEvent`` should increment the counter and each ``ConnectionCheckedInEvent`` should decrement it.
@@ -853,7 +853,7 @@ The structure of this object is as follows:
 - ``events``: Required array of one or more strings, which denote the events to be collected. Currently, only the following
- `CMAP <../connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst>`__
+ `CMAP <../connection-monitoring-and-pooling/connection-monitoring-and-pooling.md>`__
 and `command monitoring <../command-logging-and-monitoring/command-logging-and-monitoring.rst>`__ events MUST be supported:
@@ -1212,7 +1212,7 @@ The structure of each object is as follows:
 captured the events. Valid values are ``command`` for `Command Monitoring <../command-logging-and-monitoring/command-logging-and-monitoring.rst#events-api>`__ events, ``cmap`` for `CMAP
- <../connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst#events>`__
+ <../connection-monitoring-and-pooling/connection-monitoring-and-pooling.md#events>`__
 events, and ``sdam`` for `SDAM <../server-discovery-and-monitoring/server-discovery-and-monitoring-logging-and-monitoring.rst#events>`__ events. Defaults to ``command`` if omitted.
@@ -1341,7 +1341,7 @@ expectedCmapEvent
 .. _expectedEvent_poolClearedEvent:
 - ``poolClearedEvent``: Optional object. Assertions for one or more
- `PoolClearedEvent <../connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst#events>`__
+ `PoolClearedEvent <../connection-monitoring-and-pooling/connection-monitoring-and-pooling.md#events>`__
 fields. The structure of this object is as follows:
@@ -1367,7 +1367,7 @@ expectedCmapEvent
 .. _expectedEvent_connectionClosedEvent:
 - ``connectionClosedEvent``: Optional object. Assertions for one or more
- `ConnectionClosedEvent <../connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst#events>`__
+ `ConnectionClosedEvent <../connection-monitoring-and-pooling/connection-monitoring-and-pooling.md#events>`__
 fields. The structure of this object is as follows:
@@ -1385,7 +1385,7 @@ expectedCmapEvent
 - ``connectionCheckOutFailedEvent``: Optional object. Assertions for one or more `ConnectionCheckOutFailedEvent
- <../connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst#events>`__
+ <../connection-monitoring-and-pooling/connection-monitoring-and-pooling.md#events>`__
 fields. The structure of this object is as follows:
diff --git a/source/uri-options/uri-options.rst b/source/uri-options/uri-options.rst
index f0f15e85c3..d5a88703c1 100644
--- a/source/uri-options/uri-options.rst
+++ b/source/uri-options/uri-options.rst
@@ -551,5 +551,5 @@ Changelog
 ----
-.. _Connection Pooling spec: https://github.com/mongodb/specifications/blob/master/source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst#connection-pool-options-1
+.. _Connection Pooling spec: https://github.com/mongodb/specifications/blob/master/source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.md#connection-pool-options-1
 .. _SOCKS5 support spec: https://github.com/mongodb/specifications/blob/master/source/socks5-support/socks5.rst#mongoclient-configuration