From f5034fd1501c86971537692eb0e3ef0cb846f629 Mon Sep 17 00:00:00 2001 From: Steven Silvester Date: Thu, 7 Dec 2023 09:46:33 -0600 Subject: [PATCH] DRIVERS-2789 Use Markdown for Specifications Documentation --- .pre-commit-config.yaml | 12 + README.md | 160 +++ README.rst | 174 ---- scripts/README.md | 38 + scripts/migrate_to_md.py | 76 ++ .../client-side-operations-timeout.md | 905 +++++++++++++++++ .../client-side-operations-timeout.rst | 935 ------------------ 7 files changed, 1191 insertions(+), 1109 deletions(-) create mode 100644 README.md delete mode 100644 README.rst create mode 100644 scripts/README.md create mode 100644 scripts/migrate_to_md.py create mode 100644 source/client-side-operations-timeout/client-side-operations-timeout.md delete mode 100644 source/client-side-operations-timeout/client-side-operations-timeout.rst diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index fe915df299..64b98cc0ef 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -22,6 +22,18 @@ repos: args: ["--severity=error"] stages: [manual] +- repo: https://github.com/executablebooks/mdformat + rev: 0.7.17 + hooks: + - id: mdformat + additional_dependencies: + [mdformat-gfm] + +- repo: https://github.com/tcort/markdown-link-check + rev: v3.11.2 + hooks: + - id: markdown-link-check + - repo: https://github.com/rstcheck/rstcheck rev: v6.2.0 hooks: diff --git a/README.md b/README.md new file mode 100644 index 0000000000..9cafb26112 --- /dev/null +++ b/README.md @@ -0,0 +1,160 @@ +# MongoDB Specifications + +This repository holds in progress and completed specification for +features of MongoDB, Drivers, and associated products. Also contained is +a rudimentary system for producing these documents. 
+ +## Driver Mantras + +When developing specifications -- and the drivers themselves -- we +follow the following principles: + +### Strive to be idiomatic, but favor consistency + +Drivers attempt to provide the easiest way to work with MongoDB in a +given language ecosystem, while specifications attempt to provide a +consistent behavior and experience across all languages. Drivers should +strive to be as idiomatic as possible while meeting the specification +and staying true to the original intent. + +### No Knobs + +Too many choices stress out users. Whenever possible, we aim to minimize +the number of configuration options exposed to users. In particular, if +a typical user would have no idea how to choose a correct value, we pick +a good default instead of adding a knob. + +### Topology agnostic + +Users test and deploy against different topologies or might scale up +from replica sets to sharded clusters. Applications should never need to +use the driver differently based on topology type. + +### Where possible, depend on server to return errors + +The features available to users depend on a server's version, topology, +storage engine and configuration. So that drivers don't need to code and +test all possible variations, and to maximize forward compatibility, +always let users attempt operations and let the server error when it +can't comply. Exceptions should be rare: for cases where the server +might not error and correctness is at stake. + +### Minimize administrative helpers + +Administrative helpers are methods for admin tasks, like user creation. +These are rarely used and have maintenance costs as the server changes +the administrative API. Don't create administrative helpers; let users +rely on "RunCommand" for administrative commands. + +### Check wire version, not server version + +When determining server capabilities within the driver, rely only on the +maxWireVersion in the hello response, not on the X.Y.Z server version. 
+An exception is testing server development releases, as the server bumps +wire version early and then continues to add features until the GA. + +### When in doubt, use "MUST" not "SHOULD" in specs + +Specs guide our work. While there are occasionally valid technical +reasons for drivers to differ in their behavior, avoid encouraging it +with a wishy-washy "SHOULD" instead of a more assertive "MUST". + +### Defy augury + +While we have some idea of what the server will do in the future, don't +design features with those expectations in mind. Design and implement +based on what is expected in the next release. + +Case Study: In designing OP_MSG, we held off on designing support for +Document Sequences in Replies in drivers until the server would support +it. We subsequently decided not to implement that feature in the server. + +### The best way to see what the server does is to test it + +For any unusual case, relying on documentation or anecdote to anticipate +the server's behavior in different versions/topologies/etc. is +error-prone. The best way to check the server's behavior is to use a +driver or the shell and test it directly. + +### Drivers follow semantic versioning + +Drivers should follow X.Y.Z versioning, where breaking API changes +require a bump to X. See [semver.org](https://semver.org/) for more. + +### Backward breaking behavior changes and semver + +Backward breaking behavior changes can be more dangerous and disruptive +than backward breaking API changes. When thinking about the implications +of a behavior change, ask yourself what could happen if a user upgraded +your library without carefully reading the changelog and/or adequately +testing the change. + +## Writing Documents + +Write documents using +[reStructuredText](http://docutils.sourceforge.net/rst.html), following +the [MongoDB Documentation Style +Guidelines](https://www.mongodb.com/docs/meta/style-guide/). + +Store all source documents in the `source/` directory. 
+ +## Linting + +This repo uses [pre-commit](https://pypi.org/project/pre-commit/) for +managing linting. `pre-commit` performs various checks on the files and +uses tools that help follow a consistent style within the repo. + +To set up `pre-commit` locally, run: + +```bash +brew install pre-commit +pre-commit install +``` + +To run `pre-commit` manually, run `pre-commit run --all-files`. + +To run a manual hook like `rstcheck` manually, run: + +```bash +pre-commit run --all-files --hook-stage manual rstcheck +``` + +## Prose test numbering + +When numbering prose tests, always use relative numbered bullets (`#.`). +New tests must be appended at the end of the test list, since drivers +may refer to existing tests by number. + +Outdated tests must not be removed completely, but may be marked as such +(e.g. by striking through or replacing the entire test with a note (e.g. +**Removed**). + +## Building Documents + +We build the docs in `text` mode in CI to make sure they build without +errors. We don't actually support building html, since we rely on GitHub +to render the documents. To build locally, run: + +```bash +pip install sphinx +cd source +sphinx-build -W -b text . docs_build index.rst +``` + +## Converting to JSON + +There are many YAML to JSON converters. There are even several +converters called `yaml2json` in NPM. Alas, we are not using `yaml2json` +anymore, but instead the +[js-yaml](https://www.npmjs.com/package/js-yaml) package. Use only that +converter, so that JSON is formatted consistently. + +Run `npm install -g js-yaml`, then run `make` in the `source` directory +at the top level of this repository to convert all YAML test files to +JSON. + +## Licensing + +All the specs in this repository are available under the [Creative +Commons Attribution-NonCommercial-ShareAlike 3.0 United States +License](https://creativecommons.org/licenses/by-nc-sa/3.0/us/). 
diff --git a/README.rst b/README.rst deleted file mode 100644 index b0ed651f62..0000000000 --- a/README.rst +++ /dev/null @@ -1,174 +0,0 @@ -====================== -MongoDB Specifications -====================== - -This repository holds in progress and completed specification for -features of MongoDB, Drivers, and associated products. Also contained -is a rudimentary system for producing these documents. - -Driver Mantras --------------- - -When developing specifications -- and the drivers themselves -- we follow the -following principles: - -Strive to be idiomatic, but favor consistency -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Drivers attempt to provide the easiest way to work with MongoDB in a given -language ecosystem, while specifications attempt to provide a consistent -behavior and experience across all languages. Drivers should strive to be as -idiomatic as possible while meeting the specification and staying true to the -original intent. - -No Knobs -~~~~~~~~ - -Too many choices stress out users. Whenever possible, we aim to minimize the -number of configuration options exposed to users. In particular, if a typical -user would have no idea how to choose a correct value, we pick a good default -instead of adding a knob. - -Topology agnostic -~~~~~~~~~~~~~~~~~ - -Users test and deploy against different topologies or might scale up from -replica sets to sharded clusters. Applications should never need to use the -driver differently based on topology type. - -Where possible, depend on server to return errors -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The features available to users depend on a server's version, topology, storage -engine and configuration. So that drivers don't need to code and test all -possible variations, and to maximize forward compatibility, always let users -attempt operations and let the server error when it can't comply. 
Exceptions -should be rare: for cases where the server might not error and correctness is -at stake. - -Minimize administrative helpers -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Administrative helpers are methods for admin tasks, like user creation. These -are rarely used and have maintenance costs as the server changes the -administrative API. Don't create administrative helpers; let users rely on -"RunCommand" for administrative commands. - -Check wire version, not server version -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -When determining server capabilities within the driver, rely only on the -maxWireVersion in the hello response, not on the X.Y.Z server version. An -exception is testing server development releases, as the server bumps wire -version early and then continues to add features until the GA. - -When in doubt, use "MUST" not "SHOULD" in specs -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Specs guide our work. While there are occasionally valid technical reasons for -drivers to differ in their behavior, avoid encouraging it with a wishy-washy -"SHOULD" instead of a more assertive "MUST". - -Defy augury -~~~~~~~~~~~ - -While we have some idea of what the server will do in the future, don't design -features with those expectations in mind. Design and implement based on what -is expected in the next release. - -Case Study: In designing OP_MSG, we held off on designing support for Document -Sequences in Replies in drivers until the server would support it. We -subsequently decided not to implement that feature in the server. - -The best way to see what the server does is to test it -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -For any unusual case, relying on documentation or anecdote to anticipate the -server's behavior in different versions/topologies/etc. is error-prone. The -best way to check the server's behavior is to use a driver or the shell and -test it directly. 
- -Drivers follow semantic versioning -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Drivers should follow X.Y.Z versioning, where breaking API changes require a -bump to X. See `semver.org <https://semver.org/>`_ for more. - -Backward breaking behavior changes and semver -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Backward breaking behavior changes can be more dangerous and disruptive than -backward breaking API changes. When thinking about the implications of a -behavior change, ask yourself what could happen if a user upgraded your library -without carefully reading the changelog and/or adequately testing the change. - -Writing Documents ------------------ - -Write documents using `reStructuredText`_, following the `MongoDB -Documentation Style Guidelines <https://www.mongodb.com/docs/meta/style-guide/>`_. - -Store all source documents in the ``source/`` directory. - -.. _`reStructuredText`: http://docutils.sourceforge.net/rst.html - -Linting -------- - -This repo uses `pre-commit <https://pypi.org/project/pre-commit/>`_ -for managing linting. -``pre-commit`` performs various checks on the files and uses tools -that help follow a consistent style within the repo. - -To set up ``pre-commit`` locally, run: - -.. code:: bash - - pip install pre-commit # or brew install pre-commit - pre-commit install - -To run ``pre-commit`` manually, run ``pre-commit run --all-files``. - -To run a manual hook like ``rstcheck`` manually, run: - -.. code:: bash - - pre-commit run --all-files --hook-stage manual rstcheck - -Prose test numbering --------------------- - -When numbering prose tests, always use relative numbered bullets (``#.``). New -tests must be appended at the end of the test list, since drivers may refer to -existing tests by number. - -Outdated tests must not be removed completely, but may be marked as such (e.g. -by striking through or replacing the entire test with a note (e.g. **Removed**). - -Building Documents ------------------- - -We build the docs in ``text`` mode in CI to make sure they build without errors. 
-We don't actually support building html, since we rely on GitHub to render the documents. -To build locally, run: - -.. code:: bash - - pip install sphinx - cd source - sphinx-build -W -b text . docs_build index.rst - -Converting to JSON ------------------- - -There are many YAML to JSON converters. There are even several converters called -``yaml2json`` in NPM. Alas, we are not using ``yaml2json`` anymore, but instead -the `js-yaml <https://www.npmjs.com/package/js-yaml>`_ package. Use only that -converter, so that JSON is formatted consistently. - -Run ``npm install -g js-yaml``, then run ``make`` in the ``source`` directory -at the top level of this repository to convert all YAML test files to JSON. - -Licensing ----------------- -All the specs in this repository are available under the `Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States License <https://creativecommons.org/licenses/by-nc-sa/3.0/us/>`_. diff --git a/scripts/README.md b/scripts/README.md new file mode 100644 index 0000000000..260f9ef75c --- /dev/null +++ b/scripts/README.md @@ -0,0 +1,38 @@ +# Scripts for use with Drivers specifications repository + +## migrate_to_md + +Use this file to automate the process of converting a +specification from rst to GitHub Flavored Markdown. + +### Prerequisites + +```bash +brew install pandoc +brew install pre-commit +brew install python # or get python through your preferred channel +pre-commit install +``` + +### Usage + +- Run the script as: + +```bash +python3 scripts/migrate_to_md.py "source/" +``` + +- Ensure that the generated markdown file is properly formatted. + +- Add a changelog entry for the migration of the spec. + +- Ensure the links in the file are up to date. As we migrate files, relative links that point to `.rst` files will need to be updated. + + - Run `pre-commit run markdown-link-check` and address failures until that passes. + - Run a `git grep` for the converted source file name + and update any relative links to use the new `.md` + extension. + +- Remove the rst file using `git rm`. + +- Create a PR. 
When you commit the changes, the `mdformat` `pre-commit` hook will update the formatting as appropriate. diff --git a/scripts/migrate_to_md.py b/scripts/migrate_to_md.py new file mode 100644 index 0000000000..e1b29c262f --- /dev/null +++ b/scripts/migrate_to_md.py @@ -0,0 +1,76 @@ +import subprocess +import re +import sys + +if len(sys.argv) < 2: + print('Must provide a path to an RST file') + sys.exit(1) + +path = sys.argv[1] + +# Get the contents of the file. +with open(path) as fid: + lines = fid.readlines() + + +# Pre-process the file. +for (i, line) in enumerate(lines): + # Replace the colon fence blocks with bullets, + # e.g. :Status:, :deprecated:, :changed:. + # This also includes the changelog entries. + match = re.match(r':(\S+):(.*)', line) + if match: + name, value = match.groups() + lines[i] = f'- {name.capitalize()}:{value}\n' + + # Handle "":Minimum Server Version:"" as a block quote. + if line.strip().startswith(':Minimum Server Version:'): + lines[i] = '- ' + line.strip()[1:] + '' + + + # Remove the "".. contents::" block - handled by GitHub UI. + if line.strip() == '.. contents::': + lines[i] = '' + + +# Run pandoc and capture output. +proc = subprocess.Popen(['pandoc', '-f', 'rst', '-t', 'gfm'], stdin=subprocess.PIPE, stdout=subprocess.PIPE) +data = ''.join(lines).encode('utf8') +outs, _ = proc.communicate(data) +data = outs.decode('utf8') + +# Fix the strings that were missing backticks. +data = re.sub(r'<span\s+class="title-ref">', '`', data, flags=re.MULTILINE) +data = data.replace('</span>', '`') + +# Handle div blocks that were created. +# These are admonition blocks, convert to new GFM format. +in_block_outer = False +in_block_inner = False +lines = data.splitlines() +new_lines = [] +for (i, line) in enumerate(lines): + match = re.match(r'<div class="(\S+)">',line) + if not in_block_outer and match: + in_block_outer = True + new_lines.append(f'> [!{match.groups()[0].upper()}]') + continue + if line.strip() == '</div>': + if in_block_outer: + in_block_outer = False + in_block_inner = True + elif in_block_inner: + in_block_inner = False + continue + if in_block_inner: + line = '> ' + line.strip() + if not in_block_outer: + new_lines.append(line) + +# Write the new content to the markdown file. +md_file = path.replace('.rst', '.md') +with open(md_file, 'w') as fid: + fid.write('\n'.join(new_lines)) + +print('Created markdown file:') +print(md_file) diff --git a/source/client-side-operations-timeout/client-side-operations-timeout.md b/source/client-side-operations-timeout/client-side-operations-timeout.md new file mode 100644 index 0000000000..481d8ed3ec --- /dev/null +++ b/source/client-side-operations-timeout/client-side-operations-timeout.md @@ -0,0 +1,905 @@ +# Client Side Operations Timeout + +- Status: Accepted +- Minimum Server Version: 2.6 + +______________________________________________________________________ + +## Abstract + +This specification outlines a new `timeoutMS` option to govern the +amount of time that a single operation can execute before control is +returned to the user. This timeout applies to all of the work done to +execute the operation, including but not limited to server selection, +connection checkout, and server-side execution. + +## META + +The keywords “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, +“SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this +document are to be interpreted as described in [RFC +2119](https://www.ietf.org/rfc/rfc2119.txt). + +## Specification + +### Terms + +min(a, b)\ +Shorthand for "the minimum of a and b" where `a` and `b` are numeric +values. For any cases where 0 means "infinite" (e.g. +[timeoutMS](#timeoutms)), `min(0, other)` MUST evaluate to `other`. + +### MongoClient Configuration + +This specification introduces a new configuration option and deprecates +some existing options. + +#### timeoutMS + +This 64-bit integer option specifies the per-operation timeout value in +milliseconds. 
The default value is unset which means this feature is not +enabled, i.e. the existing timeout behavior is unchanged (including +`serverSelectionTimeoutMS`, `connectTimeoutMS`, `socketTimeoutMS` +etc..). An explicit value of 0 means infinite, though some client-side +timeouts like `serverSelectionTimeoutMS` will still apply. Drivers MUST +error if a negative value is specified. This value MUST be configurable +at the level of a MongoClient, MongoDatabase, MongoCollection, or of a +single operation. However, if the option is specified at any level, it +cannot be later changed to unset. At each level, the value MUST be +inherited from the previous level if it is not explicitly specified. +Additionally, some entities like `ClientSession` and `GridFSBucket` +either inherit `timeoutMS` from their parent entities or provide options +to override it. The behavior for these entities is described in +individual sections of this specification. + +Drivers for languages that provide an idiomatic API for expressing +durations of time (e.g. `TimeSpan` in .NET) MAY choose to leverage these +APIs for the `timeoutMS` option rather than using int64. Drivers that +choose to do so MUST also follow the semantics for special values +defined by those types. Such drivers MUST also ensure that there is a +way to explicitly set `timeoutMS` to `infinite` in the API. + +See [timeoutMS cannot be changed to unset once it’s +specified](#timeoutms-cannot-be-changed-to-unset-once-its-specified). + +#### Backwards Breaking Considerations + +This specification deprecates many existing timeout options and +introduces a new exception type that is used to communicate timeout +expiration. If drivers need to make backwards-breaking changes to +support `timeoutMS`, the backwards breaking behavior MUST be gated +behind the presence of the `timeoutMS` option. If the `timeoutMS` option +is not set, drivers MUST continue to honor existing timeouts such as +`socketTimeoutMS`. 
Backwards breaking changes include any changes to +exception types thrown by stable API methods or changes to timeout +behavior. Drivers MUST document these changes. + +In a subsequent major release, drivers SHOULD drop support for legacy +timeout behavior and only continue to support the timeout options that +are not deprecated by this specification. Once legacy options are +removed, drivers MUST make the backwards-breaking behavioral changes +described in this specification regardless of whether or not `timeoutMS` +is set by the application. + +See the [Errors](#errors) section for explanations of the +backwards-breaking changes to error reporting. + +#### Deprecations + +The following configuration timeout options MUST be deprecated in favor +of `timeoutMS`: + +- `socketTimeoutMS` +- `waitQueueTimeoutMS` +- `wTimeoutMS` + +The following options for CRUD methods MUST be deprecated in favor of +`timeoutMS`: + +- `maxTimeMS` +- `maxCommitTimeMS` + +### Timeout Behavior + +The `timeoutMS` option specifies the best-effort maximum amount of time +a single operation can take before control is returned to the +application. Drivers MUST keep track of the remaining time before the +timeout expires as an operation progresses. + +#### Operations + +The `timeoutMS` option applies to all operations defined in the +following specifications: + +- [CRUD](./../crud/crud.rst) +- [Change Streams](../change-streams/change-streams.rst) +- [Client Side + Encryption](../client-side-encryption/client-side-encryption.rst) +- [Enumerating Collections](../enumerate-collections.rst) +- [Enumerating Databases](../enumerate-databases.rst) +- [GridFS](../gridfs/gridfs-spec.rst) +- [Index Management](../index-management/index-management.rst) +- [Transactions](../transactions/transactions.rst) +- [Convenient API for + Transactions](../transactions-convenient-api/transactions-convenient-api.rst) + +In addition, it applies to all operations on cursor objects that may +perform blocking work (e.g. 
methods to iterate or close a cursor, any +method that reads documents from a cursor into an array, etc). + +#### Validation and Overrides + +When executing an operation, drivers MUST ignore any deprecated timeout +options if `timeoutMS` is set on the operation or is inherited from the +collection/database/client levels. In addition to being set at these +levels, the timeout for an operation can also be expressed via an +explicit ClientSession (see [Convenient Transactions +API](#convenient-transactions-api)). In this case, the timeout on the +session MUST be used as the `timeoutMS` value for the operation. Drivers +MUST raise a validation error if an explicit session with a timeout is +used and the `timeoutMS` option is set at the operation level for +operations executed as part of a `withTransaction` callback. + +See [timeoutMS overrides deprecated timeout +options](#timeoutms-overrides-deprecated-timeout-options). + +#### Errors + +If the `timeoutMS` option is not set and support for deprecated timeout +options has not been dropped but a timeout is encountered (e.g. server +selection times out), drivers MUST continue to return existing errors. +This ensures that error-handling code in existing applications does not +break unless the user opts into using `timeoutMS`. + +If the `timeoutMS` option is set and the timeout expires, drivers MUST +abort all blocking work and return control to the user with an error. +This error MUST be distinguished in some way (e.g. custom exception +type) to make it easier for users to detect when an operation fails due +to a timeout. If the timeout expires during a blocking task, drivers +MUST expose the underlying error returned from the task from this new +error type. The stringified version of the new error type MUST include +the stringified version of the underlying error as a substring. 
For +example, if server selection expires and returns a +`ServerSelectionTimeoutException`, drivers must allow users to access +that exception from this new error type. If there is no underlying +error, drivers MUST add information about when the timeout expiration +was detected to the stringified version of the timeout error. + +##### Error Transformations + +When using the new timeout error type, drivers MUST transform timeout +errors from external sources into the new error. One such error is the +`MaxTimeMSExpired` server error. When checking for this error, drivers +MUST only check that the error code is 50 and MUST NOT check the code +name or error message. This error can be present in a top-level response +document where the `ok` value is 0, as part of an error in the +`writeErrors` array, or in a nested `writeConcernError` document. For +example, all three of the following server responses would match this +criteria: + +```javascript +{ok: 0, code: 50, codeName: "MaxTimeMSExpired", errmsg: "operation time limit exceeded"} + +{ok: 1, writeErrors: [{code: 50, codeName: "MaxTimeMSExpired", errmsg: "operation time limit exceeded"}]} + +{ok: 1, writeConcernError: {code: 50, codeName: "MaxTimeMSExpired"}} +``` + +Timeouts from other sources besides MongoDB servers MUST also be +transformed into this new exception type. These include socket +read/write timeouts and HTTP request timeouts. + +#### Blocking Sections for Operation Execution + +The following pieces of operation execution are considered blocking: + +1. Implicit session acquisition if an explicit session was not provided + for the operation. This is only considered blocking for drivers that + perform server selection to determine session support when acquiring + implicit sessions. +1. Server selection +1. Connection checkout - If `maxPoolSize` has already been reached for + the selected server, this is the amount of time spent waiting for a + connection to be available. +1. 
Connection establishment - If the pool for the selected server is + empty and a new connection is needed, the following pieces of + connection establishment are considered blocking: + 1. TCP socket establishment + 1. TLS handshake + 1. All messages sent over the socket as part of the TLS + handshake + 1. OCSP verification - HTTP requests sent to OCSP responders. + 1. MongoDB handshake (i.e. initial connection `hello`) + 1. Authentication + 1. SCRAM-SHA-1, SCRAM-SHA-256, PLAIN: Execution of the command + required for the SASL conversation. + 1. GSSAPI: Execution of the commands required for the SASL + conversation and requests to the KDC and TGS. + 1. MONGODB-AWS: Execution of the commands required for the SASL + conversation and all HTTP requests to ECS and EC2 endpoints. + 1. MONGODB-X509: Execution of the commands required for the + authentication conversation. +1. Client-side encryption + 1. Execution of `listCollections` commands to get collection + schemas. + 1. Execution of `find` commands against the key vault collection to + get encrypted data keys. + 1. Requests to non-local key management servers (e.g. AWS KMS) to + decrypt data keys. + 1. Requests to mongocryptd servers. +1. Socket write to send a command to the server +1. Socket read to receive the server’s response + +The `timeoutMS` option MUST apply to all blocking sections. Drivers MUST +document any exceptions. For example, drivers that do not have full +control over OCSP verification might not be able to set timeouts for +HTTP requests to responders and would document that OCSP verification +could result in an execution time greater than `timeoutMS`. + +#### Server Selection + +If `timeoutMS` is set, drivers MUST use +`min(serverSelectionTimeoutMS, remaining timeoutMS)`, referred to as +`computedServerSelectionTimeout` as the timeout for server selection and +connection checkout. The server selection loop MUST fail with a timeout +error once the timeout expires. 
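The `min(a, b)` rule from the Terms section and the `computedServerSelectionTimeout` rule above can be sketched in a few lines. This is only an illustration, not driver code: the names `spec_min` and `computed_server_selection_timeout` are hypothetical, and the sketch assumes every value is an integer number of milliseconds in which 0 means "infinite" and that the caller has already checked that the timeout has not expired.

```python
def spec_min(a: int, b: int) -> int:
    """Minimum of two timeouts in milliseconds, treating 0 as infinite.

    Hypothetical helper: min(0, other) evaluates to other, as the Terms
    section requires. If both values are 0 the result stays 0 (infinite).
    """
    if a == 0:
        return b
    if b == 0:
        return a
    return min(a, b)


def computed_server_selection_timeout(
    server_selection_timeout_ms: int, remaining_timeout_ms: int
) -> int:
    """min(serverSelectionTimeoutMS, remaining timeoutMS).

    Assumption: remaining_timeout_ms uses 0 for "no deadline", so an
    expired timeout must be detected by the caller before this is used.
    """
    return spec_min(server_selection_timeout_ms, remaining_timeout_ms)
```

With a 30-second `serverSelectionTimeoutMS` and 10 seconds left on the operation, the sketch yields 10000; with `timeoutMS` set to 0 it falls back to the 30-second server selection timeout, which is why some client-side timeouts still apply to an "infinite" operation.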
+ +After a server has been selected, drivers MUST use the remaining +`computedServerSelectionTimeout` value as the timeout for connection +checkout. If a new connection is required, +`min(connectTimeoutMS, remaining computedServerSelectionTimeout)` MUST +be used as the timeout for TCP socket establishment. Any network +requests required to create or authenticate a connection (e.g. HTTP +requests to OCSP responders) MUST use +`min(operationTimeout, remaining computedServerSelectionTimeout)` as a +timeout, where `operationTimeout` is the specified default timeout for +the network request. If there is no specified default, these operations +MUST use the remaining `computedServerSelectionTimeout` value. All +commands sent during the connection’s handshake MUST use the remaining +`computedServerSelectionTimeout` as their `timeoutMS` value. Handshake +commands MUST also set timeouts per the [Command +Execution](#command-execution) section. + +If `timeoutMS` is not set and support for `waitQueueTimeoutMS` has not +been removed, drivers MUST continue to exhibit the existing timeout +behavior by honoring `serverSelectionTimeoutMS` for server selection and +`waitQueueTimeoutMS` for connection checkout. If a new connection is +required, drivers MUST use `connectTimeoutMS` as the timeout for socket +establishment and `socketTimeoutMS` as the socket timeout for all +handshake commands. + +See [serverSelectionTimeoutMS is not +deprecated](#serverselectiontimeoutms-is-not-deprecated) and +[connectTimeoutMS is not +deprecated](#connecttimeoutms-is-not-deprecated). + +#### Command Execution + +If `timeoutMS` is set, drivers MUST append a `maxTimeMS` field to +commands executed against a MongoDB server using the `minRoundTripTime` +field of the selected server. 
Note that this value MUST be retrieved +during server selection using the `servers` field of the same +[TopologyDescription](../server-discovery-and-monitoring/server-discovery-and-monitoring.rst#TopologyDescription) +that was used for selection before the selected server's description can +be modified. Otherwise, drivers may be subject to a race condition where +a server is reset to the default description (e.g. due to an error in +the monitoring thread) after it has been selected but before the RTT is +retrieved. + +If the `minRoundTripTime` is less than the remaining timeoutMS, the +value of this field MUST be `remaining timeoutMS - minRoundTripTime`. If +not, drivers MUST return a timeout error without attempting to send the +message to the server. This is done to ensure that an operation is not +routed to the server if it will likely fail with a socket timeout as +that could cause connection churn. The `maxTimeMS` field MUST be +appended after all blocking work is complete. + +After wire message construction, drivers MUST check for timeout before +writing the message to the server. If the timeout has expired or the +amount of time remaining is less than the selected server's minimum RTT, +drivers MUST return the connection to the pool and raise a timeout +exception. Otherwise, drivers MUST set the connection’s write timeout to +the remaining `timeoutMS` value before writing a message to the server. +After the write is complete, drivers MUST check for timeout expiration +before reading the server’s response. If the timeout has expired, the +connection MUST be closed and a timeout exception MUST be propagated to +the application. If it has not, drivers MUST set the connection’s read +timeout to the remaining `timeoutMS` value. The timeout MUST apply to +the aggregate of all reads done to receive a server response, not to +individual reads. 
If any read or write calls on the socket fail with a +timeout, drivers MUST transform the error into the new timeout exception +as described in the [Error Transformations](#error-transformations) +section. + +If `timeoutMS` is not set and support for `socketTimeoutMS` has not been +removed, drivers MUST honor `socketTimeoutMS` as the timeout for socket +reads and writes. + +See [maxTimeMS accounts for server +RTT](#maxtimems-accounts-for-server-rtt). + +#### Batching + +If an operation must be sent to the server in multiple batches (e.g. +`collection.bulkWrite()`), the `timeoutMS` option MUST apply to the +entire operation, not to each individual batch. + +#### Retryability + +If an operation requires a retry per the retryable reads or writes +specifications and `timeoutMS` is set, drivers MUST retry operations as +many times as possible before the timeout expires or a retry attempt +returns a non-retryable error. Once the timeout expires, a timeout error +MUST be raised. + +See [Why don’t drivers use backoff/jitter between retry +attempts?](#why-dont-drivers-use-backoffjitter-between-retry-attempts). + +#### Client Side Encryption + +If automatic client-side encryption or decryption is enabled, the +remaining `timeoutMS` value MUST be used as the `timeoutMS` when +executing `listCollections` commands to retrieve collection schemas, +`find` commands to get data from the key vault, and any commands against +mongocryptd. It MUST also be used as the request timeout for HTTP +requests against KMS servers to decrypt data keys. When sending a +command to mongocryptd, drivers MUST NOT append a `maxTimeMS` field. +This is to ensure that a `maxTimeMS` field can be safely appended to the +command after it has been marked by mongocryptd and encrypted by +libmongocrypt. To determine whether or not the server is a mongocryptd, +drivers MUST check that the `iscryptd` field in the server's description +is `true`. 
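
The `maxTimeMS` rule from the Command Execution section and the mongocryptd exception described above can be sketched together. This is only an illustration, not any driver's implementation: `OperationTimeoutError`, `apply_max_time_ms`, and the dict-based server description are hypothetical names, and the sketch assumes a finite remaining timeout expressed in whole milliseconds.

```python
class OperationTimeoutError(Exception):
    """Hypothetical stand-in for a driver's timeout exception type."""


def apply_max_time_ms(command, remaining_timeout_ms, min_rtt_ms, server_description):
    """Return a copy of `command` with maxTimeMS set per the rules above.

    Assumptions (not from the spec): timeouts are integer milliseconds, the
    remaining timeout is finite, and the server description is a plain dict.
    """
    out = dict(command)

    # Commands sent to mongocryptd never get maxTimeMS, so the field can be
    # appended later, after marking and encryption.
    if server_description.get("iscryptd") is True:
        return out

    # If the minimum RTT is not less than the remaining timeout, fail
    # without sending anything to the server.
    if min_rtt_ms >= remaining_timeout_ms:
        raise OperationTimeoutError("remaining timeout is not greater than the minimum RTT")

    # maxTimeMS = remaining timeoutMS - minRoundTripTime
    out["maxTimeMS"] = remaining_timeout_ms - min_rtt_ms
    return out
```

With 1000 ms remaining and a 25 ms minimum RTT, the command sent to a regular server carries `maxTimeMS: 975`, while the same call against a server whose description has `iscryptd: true` leaves the command unchanged.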
+ +For explicit encryption and decryption, the `ClientEncryptionOpts` +options type used to construct +[ClientEncryption](../client-side-encryption/client-side-encryption.rst#clientencryption) +instances MUST support a new `timeoutMS` option, which specifies the +timeout for all operations executed on the `ClientEncryption` object. + +See [maxTimeMS is not added for +mongocryptd](#maxtimems-is-not-added-for-mongocryptd). + +### Background Connection Pooling + +Connections created as part of a connection pool’s `minPoolSize` +maintenance routine MUST use `connectTimeoutMS` as the timeout for +connection establishment. After the connection is established, if +`timeoutMS` is set at the MongoClient level, it MUST be used as the +timeout for all commands sent as part of the MongoDB or authentication +handshakes. The timeout MUST be refreshed after each command. These +commands MUST set timeouts per the [Command +Execution](#command-execution) section. If `timeoutMS` is not set, +drivers MUST continue to honor `socketTimeoutMS` as the socket timeout +for handshake and authentication commands. + +### Server Monitoring + +Drivers MUST NOT use `timeoutMS` for commands executed by the server +monitoring and RTT calculation threads. + +See [Monitoring threads do not use +timeoutMS](#monitoring-threads-do-not-use-timeoutms). + +### Cursors + +For operations that create cursors, `timeoutMS` can either cap the +lifetime of the cursor or be applied separately to the original +operation and all `next` calls. To support both of these use cases, +these operations MUST support a `timeoutMode` option. This option is an +enum with possible values `CURSOR_LIFETIME` and `ITERATION`. The default +value depends on the type of cursor being created. Drivers MUST error if +`timeoutMode` is set and `timeoutMS` is not. + +When applying the `timeoutMS` option to `next` calls on cursors, drivers +MUST ensure it applies to the entire call, not individual commands. 
For +drivers that send `getMore` requests in a loop when iterating tailable +cursors, the timeout MUST apply to the totality of all `getMore`’s, not +to each one individually. If a resume is required for a `next` call on a +change stream, the timeout MUST apply to the entirety of the initial +`getMore` and all commands sent as part of the resume attempt. + +For `close` methods, drivers MUST allow `timeoutMS` to be overridden if +doing so is possible in the language. If explicitly set for the +operation, it MUST be honored. Otherwise, if `timeoutMS` was applied to +the operation that created the cursor, it MUST be refreshed for the +`killCursors` command if one is required. Note that this means +`timeoutMS` will be refreshed for the `close` call even if the cursor +was created with a `timeoutMode` of `CURSOR_LIFETIME` and the timeout +associated with the cursor has expired. The calculated timeout MUST +apply to explicit `close` methods that can be invoked by users as well +as implicit destructors that are automatically invoked when exiting +resource blocks. + +See [Cursor close() methods refresh +timeoutMS](#cursor-close-methods-refresh-timeoutms). + +#### Non-tailable Cursors + +For non-tailable cursors, the default value of `timeoutMode` is +`CURSOR_LIFETIME`. If `timeoutMS` is set, drivers MUST apply it to the +original operation and the lifetime of the created cursor. For example, +if a `find` is executed at time `T`, the `find` and all `getMore`’s on +the cursor must finish by time `T + timeoutMS`. When executing `next` +calls on the cursor, drivers MUST use the remaining timeout as the +`timeoutMS` value for the operation but MUST NOT append a `maxTimeMS` +field to `getMore` commands. If there are documents remaining in a +previously retrieved batch, the `next` method MUST return them even if +the timeout has expired and MUST only return a timeout error if a +`getMore` is required. 
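
The `CURSOR_LIFETIME` rules above can be sketched as a cursor that carries a single deadline. This is a simplified illustration under stated assumptions: `LifetimeCursor`, `OperationTimeoutError`, the `get_more` callback, and the "empty batch means exhausted" convention are all hypothetical, and the deadline is assumed to have been computed as `T + timeoutMS` when the initial `find` started.

```python
import time
from collections import deque


class OperationTimeoutError(Exception):
    """Hypothetical stand-in for a driver's timeout exception type."""


class LifetimeCursor:
    """Sketch of a non-tailable cursor with timeoutMode=CURSOR_LIFETIME."""

    def __init__(self, first_batch, get_more, deadline, clock=time.monotonic):
        self._buffer = deque(first_batch)
        self._get_more = get_more  # hypothetical: callable(remaining_ms) -> list of docs
        self._deadline = deadline  # seconds on `clock`; T + timeoutMS from the find
        self._clock = clock
        self._exhausted = False

    def next(self):
        # Buffered documents are returned even if the timeout has expired.
        if self._buffer:
            return self._buffer.popleft()
        if self._exhausted:
            raise StopIteration

        # A getMore is required, so the deadline now matters.
        remaining_ms = (self._deadline - self._clock()) * 1000.0
        if remaining_ms <= 0:
            raise OperationTimeoutError("cursor lifetime exceeded")

        # The remaining time is the timeout for this getMore; no maxTimeMS
        # field is appended to the getMore command itself.
        batch = self._get_more(remaining_ms)
        if not batch:
            self._exhausted = True
            raise StopIteration
        self._buffer.extend(batch)
        return self._buffer.popleft()
```

The point of the sketch is the ordering of the checks: the buffer is drained first, and the deadline is consulted only when another round trip would be needed.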
+ +If `timeoutMode` is set to `ITERATION`, drivers MUST raise a client-side +error if the operation is an `aggregate` with a `$out` or `$merge` +pipeline stage. If the operation is not an `aggregate` with `$out` or +`$merge`, drivers MUST honor the `timeoutMS` option for the initial +command but MUST NOT append a `maxTimeMS` field to the command sent to +the server. After the operation has executed, the original `timeoutMS` +value MUST also be applied to each `next` call on the created cursor. +Drivers MUST NOT append a `maxTimeMS` field to `getMore` commands. + +See [Non-tailable cursor behavior](#non-tailable-cursor-behavior). + +#### Tailable Cursors + +Tailable cursors only support the `ITERATION` value for the +`timeoutMode` option. This is the default value and drivers MUST error +if the option is set to `CURSOR_LIFETIME`. + +##### Tailable non-awaitData Cursors + +If `timeoutMS` is set, drivers MUST apply it separately to the original +operation and to all `next` calls on the resulting cursor but MUST NOT +append a `maxTimeMS` field to any commands. + +##### Tailable awaitData Cursors + +If `timeoutMS` is set, drivers MUST apply it to the original operation. +Drivers MUST also apply the original `timeoutMS` value to each `next` +call on the resulting cursor but MUST NOT use it to derive a `maxTimeMS` +value for `getMore` commands. Helpers for operations that create +tailable awaitData cursors MUST also support the `maxAwaitTimeMS` +option. Drivers MUST error if this option is set, `timeoutMS` is set to +a non-zero value, and `maxAwaitTimeMS` is greater than or equal to +`timeoutMS`. If this option is set, drivers MUST use it as the +`maxTimeMS` field on `getMore` commands. + +See [Tailable cursor behavior](#tailable-cursor-behavior) for rationale +regarding both non-awaitData and awaitData cursors. + +#### Change Streams + +Driver `watch` helpers MUST support both `timeoutMS` and +`maxAwaitTimeMS` options. 
Drivers MUST error if `maxAwaitTimeMS` is set, +`timeoutMS` is set to a non-zero value, and `maxAwaitTimeMS` is greater +than or equal to `timeoutMS`. These helpers MUST NOT support the +`timeoutMode` option as change streams are an abstraction around +tailable-awaitData cursors, so they implicitly use `ITERATION` mode. If +set, drivers MUST apply the `timeoutMS` option to the initial +`aggregate` operation. Drivers MUST also apply the original `timeoutMS` +value to each `next` call on the change stream but MUST NOT use it to +derive a `maxTimeMS` field for `getMore` commands. If the +`maxAwaitTimeMS` option is set, drivers MUST use it as the `maxTimeMS` +field on `getMore` commands. + +If a `next` call fails with a timeout error, drivers MUST NOT invalidate +the change stream. The subsequent `next` call MUST perform a resume +attempt to establish a new change stream on the server. Any errors from +the `aggregate` operation done to create a new change stream MUST be +propagated to the application. Drivers MUST document that users can +either call `next` again or close the existing change stream and create +a new one if a previous `next` call times out. The documentation MUST +suggest closing and re-creating the stream with a higher timeout if the +timeout occurs before any events have been received because this is a +signal that the server is timing out before it can finish processing the +existing oplog. + +See [Change stream behavior](#change-stream-behavior). + +### Sessions + +The +[SessionOptions](../sessions/driver-sessions.rst#mongoclient-changes) +used to construct explicit +[ClientSession](../sessions/driver-sessions.rst#clientsession) instances +MUST accept a new `defaultTimeoutMS` option, which specifies the +`timeoutMS` value for the following operations executed on the session: + +1. commitTransaction +1. abortTransaction +1. withTransaction +1. 
endSession + +If this option is not specified for a `ClientSession`, it MUST inherit +the `timeoutMS` of its parent MongoClient. + +#### Session checkout + +As noted in [Blocking Sections for Operation +Execution](#blocking-sections-for-operation-execution), implicit session +checkout can be considered a blocking process for some drivers. Such +drivers MUST apply the remaining `timeoutMS` value to this process when +executing an operation. For explicit session checkout, drivers MUST +apply the `timeoutMS` value of the MongoClient to the `startSession` +call if set. Drivers MUST NOT allow users to override `timeoutMS` for +`startSession` operations. + +See [timeoutMS cannot be overridden for startSession +calls](#timeoutms-cannot-be-overridden-for-startsession-calls). + +#### Convenient Transactions API + +If `timeoutMS` is set, drivers MUST apply it to the entire +`withTransaction` call. To propagate the timeout to the user-supplied +callback, drivers MUST store the timeout as a field on the ClientSession +object. This field SHOULD be private to ensure that a user cannot +modify it while a `withTransaction` call is in progress. Drivers that +cannot make this field private MUST signal that the field should not be +accessed or modified by users if there is an idiomatic way to do so in +the language (e.g. underscore-prefixed variable names in Python) and +MUST document that modification of the field can cause unintended +correctness issues for applications. Drivers MUST document that the +remaining timeout will not be applied to callback operations that do not +use the ClientSession. Drivers MUST also document that overriding +`timeoutMS` for operations executed using the explicit session inside the +provided callback will result in a client-side error, as defined in +[Validation and Overrides](#validation-and-overrides). 
If the callback +returns an error and the transaction must be aborted, drivers MUST +refresh the `timeoutMS` value for the `abortTransaction` operation. + +If `timeoutMS` is not set, drivers MUST continue to exhibit the existing +120 second timeout behavior. Drivers MUST NOT change existing +implementations to use `timeoutMS=120000` for this case. + +See [withTransaction communicates timeoutMS via +ClientSession](#withtransaction-communicates-timeoutms-via-clientsession) +and [withTransaction refreshes the timeout for +abortTransaction](#withtransaction-refreshes-the-timeout-for-aborttransaction). + +### GridFS API + +GridFS buckets MUST inherit `timeoutMS` from their parent MongoDatabase +instance and all methods in the GridFS Bucket API MUST support the +`timeoutMS` option. For methods that create streams (e.g. +`open_upload_stream`), the option MUST cap the lifetime of the entire +stream. This MUST include the time taken by any operations executed +during stream construction, reads/writes, and close/abort calls. For +example, if a stream is created at time `T`, the final `close` call on +the stream MUST finish all blocking work before time `T + timeoutMS`. +Methods that interact with a user-provided stream (e.g. +`upload_from_stream`) MUST use `timeoutMS` as the timeout for the entire +upload/download operation. If the user-provided streams do not support +timeouts, drivers MUST document that the timeout for these methods may +be breached if calls to interact with the stream take longer than the +remaining timeout. If `timeoutMS` is set, all cursors created for GridFS +API operations MUST internally set the `timeoutMode` option to +`CURSOR_LIFETIME`. + +See [GridFS streams behavior](#gridfs-streams-behavior). + +### RunCommand + +The behavior of `runCommand` is undefined if the provided command +document includes a `maxTimeMS` field and the `timeoutMS` option is set. 
+Drivers MUST document the behavior of `runCommand` for this case and +MUST NOT attempt to check the command document for the presence of a +`maxTimeMS` field. + +See [runCommand behavior](#runcommand-behavior). + +## Test Plan + +See the +[README.rst](https://github.com/mongodb/specifications/blob/master/source/client-side-operations-timeout/tests/README.rst) +in the tests directory. + +## Motivation for Change + +Users have many options to set timeouts for various parts of operation +execution including, but not limited to, `serverSelectionTimeoutMS`, +`socketTimeoutMS`, `connectTimeoutMS`, `maxTimeMS`, and `wTimeoutMS`. As +a result, users are often unsure which timeout to use. Because some of +these timeouts are additive, it is difficult to set a combination which +ensures control will be returned to the user after a specified amount of +time. To make timeouts more intuitive, changes are required to the +drivers API to deprecate some of the existing timeouts and add a new one +to specify the maximum execution time for an entire operation from start +to finish. + +In addition, automatically retrying reads and writes that failed due to +transient network blips or planned maintenance scenarios has improved +application resiliency but the original behavior of only retrying once +still allowed some errors to be propagated to applications. Supporting a +timeout for an entire operation allows drivers to retry operations +multiple times while still guaranteeing that an application can get back +control once the specified amount of time has elapsed. + +## Design Rationale + +### timeoutMS cannot be changed to unset once it’s specified + +If `timeoutMS` is specified at any level, it cannot be later changed to +unset at a lower level. 
For example, a user cannot do: + +```python +client = MongoClient(uri, timeoutMS=1000) +db = client.database("foo", timeoutMS=None) +``` + +This is because drivers return existing exception types if `timeoutMS` +is not specified, but will return new exception types and use new +timeout behaviors if it is. Once the user has opted into this behavior, +we should not allow them to opt out of it at a lower level. If a user +wishes to set the timeout to infinite for a specific database, +collection, or operation, they can explicitly set `timeoutMS` to 0. + +### serverSelectionTimeoutMS is not deprecated + +The original goal of the project was to expose a single timeout and +deprecate all others. This was not possible, however, because executing +an operation consists of two distinct parts. The first is selecting a +server and checking out a connection from its pool. This should have a +default timeout because failure to do this indicates that the deployment +is not in a healthy state or that there was a configuration error which +prevents the driver from successfully connecting. The second is +server-side operation execution, which cannot have a default timeout. +Some operations finish in a few milliseconds, while others can run for +many hours. Adding a default would inevitably break applications. To +accomplish both of these goals, `serverSelectionTimeoutMS` was preserved +and is used to timeout the client-side section of operation execution. + +### connectTimeoutMS is not deprecated + +Similar to the reasoning for not deprecating `serverSelectionTimeoutMS`, +socket establishment should have a default timeout because failure to +create a socket likely means that the target server is not healthy or +there is a network issue. To accomplish this, the `connectTimeoutMS` +option is not deprecated by this specification. Drivers also use +`connectTimeoutMS` to derive a socket timeout for monitoring +connections, which are not subject to timeoutMS. 
+ +### timeoutMS overrides deprecated timeout options + +Applying both `timeoutMS` and a deprecated timeout option like +`socketTimeoutMS` at the same time would lead to confusing semantics +that are difficult to document and understand. When first writing this +specification, we considered having drivers error in this situation to +catch mismatched timeouts as early as possible. However, because +`timeoutMS` can be set at any level, this behavior could lead to +unanticipated runtime errors if an application set `timeoutMS` for a +specific operation and the MongoClient used in production was configured +with a deprecated timeout option. To have clear semantics and avoid +unexpected errors in applications, we decided that `timeoutMS` should +override deprecated timeout options. + +### maxTimeMS is not added for mongocryptd + +The mongocryptd server annotates the provided command to indicate +encryption requirements and returns the marked up result. If the command +sent to mongocryptd contained `maxTimeMS`, the final command sent to +MongoDB would contain two `maxTimeMS` fields: one added by the regular +MongoClient and another added by the mongocryptd client. To avoid this +complication, drivers do not add this field when sending commands to +mongocryptd at all. Doing so does not sacrifice any functionality +because mongocryptd always runs on localhost and does not perform any +blocking work, so execution or network timeouts cannot occur. + +### maxTimeMS accounts for server RTT + +When constructing a command, drivers use the `timeoutMS` option to +derive a value for the `maxTimeMS` command option and the socket +timeout. The full time to round trip a command is (network RTT + +server-side execution time). If both `maxTimeMS` and socket timeout were +set to the same value, the server would never be able to respond with a +`MaxTimeMSExpired` error because drivers would hit the socket timeout +first and close the connection. 
This would lead to connection churn if +the specified timeout is too low. To allow the server to gracefully +error and avoid churn, drivers must account for the network round trip +in the `maxTimeMS` calculation. + +### Monitoring threads do not use timeoutMS + +Using `timeoutMS` in the monitoring and RTT calculation threads would +require another special case in the code that derives `maxTimeMS` from +`timeoutMS` because the awaitable `hello` requests sent to 4.4+ servers +already have a `maxAwaitTimeMS` field. Adding `maxTimeMS` also does not +help for non-awaitable `hello` commands because we expect them to +execute quickly on the server. The Server Monitoring spec already +mandates that drivers set and dynamically update the read/write timeout +of the dedicated connections used in monitoring threads, so we rely on +that to time out commands rather than adding complexity to the behavior +of `timeoutMS`. + +### runCommand behavior + +The behavior of runCommand varies across drivers. If the provided +command document includes a `maxTimeMS` field and the `timeoutMS` option +is set, some drivers would overwrite the `maxTimeMS` field with the +value derived from `timeoutMS`, while others would append a second +`maxTimeMS` field, which would cause a server error on versions 3.4+. To +be prescriptive, we could mandate that drivers raise a client-side error +in this case, but this would require a potentially expensive lookup in +the command document. To avoid this additional cost, drivers are only +required to document the behavior and suggest that `timeoutMS` be used +instead of including a manual `maxTimeMS` field. + +### Why don’t drivers use backoff/jitter between retry attempts? + +Earlier versions of this specification proposed adding backoff and/or +jitter between retry attempts to avoid connection storming or +overloading the server, but we later deemed this unnecessary. 
If +multiple concurrent operations select the same server for a retry and +its connection pool is empty, we rely on the `maxConnecting` parameter +introduced in DRIVERS-781 to rate limit new connection attempts, which +mitigates the risk of connection storms. Even if the new server has +enough connections in its pool to service the operations, recent server +versions do very little resource-intensive work until execution reaches +the storage layer, which is already guarded by read/write tickets, so we +don’t expect the server to be overwhelmed. If we later decide that +adding jitter would be useful, it may be easier to do so in the server +itself via a ticket-based admission system earlier in the execution +stack. + +### Cursor close() methods refresh timeoutMS + +If a cursor times out client-side (e.g. a non-tailable cursor created +with `timeoutMode=CURSOR_LIFETIME`), it’s imperative that drivers make a +good-faith effort to close the server-side cursor even though the +timeout has expired because failing to do so would leave resources open +on the server for a potentially long time. It was decided that +`timeoutMS` will be refreshed for `close` operations to allow the cursor +to be killed server-side. + +### Non-tailable cursor behavior + +There are two usage patterns for non-tailable cursors. The first is to +read documents from a cursor into an iterable object, either by +explicitly iterating the cursor in a loop or using a language construct +like Python list comprehensions. To supply a timeout for the entire +process, drivers use `timeoutMS` to cap the execution time for the +initial command and all required `getMore`’s. This use case also matches +the server behavior; if `maxTimeMS` is set for an operation that creates +a non-tailable cursor, the server will use the time limit to cap the +total server-side execution time for future `getMore`’s. 
Because this +type of usage matches the server behavior and is the more common case, +this is the default behavior. + +The second use case is batch processing, where the user takes advantage +of the lazy nature of cursors to process documents from a large +collection. In this case, the user does not want all documents from the +collection to be in an array because that would require too much memory. +To accommodate this use case, drivers support a new `timeoutMode` +option. Users can set the value for this option to `ITERATION` to have +`timeoutMS` apply to the original command and then separately to each +`next` call. When this option is used, drivers do not set `maxTimeMS` on +the initial command to avoid capping the cursor lifetime in the server. + +### Tailable cursor behavior + +Once a tailable cursor is created, it conceptually lives forever. +Therefore, it only makes sense to support `timeoutMode=ITERATION` for +these cursors and drivers error if `timeoutMode=CURSOR_LIFETIME` is +specified. + +There are two types of tailable cursors. The first, tailable +non-awaitData cursors, support `maxTimeMS` for the original command but +not for any `getMore` requests. However, setting `maxTimeMS` on the +original command also incorrectly caps the server-side execution time +for future `getMore`’s +([SERVER-51153](http://jira.mongodb.org/browse/SERVER-51153)). This is +undesirable behavior because it does not match the guarantees made by +`timeoutMode=ITERATION`. To work around this, drivers honor `timeoutMS` +for both the original operation and all `getMore`’s but only use it to +derive client-side timeouts and do not append a `maxTimeMS` field to any +commands. The server-side execution time is enforced via socket +timeouts. + +The second type is tailable awaitData cursors. The server supports the +`maxTimeMS` option for the original command. 
For `getMore`’s, the option +is supported, but instead of limiting the server-side execution time, it +specifies how long the server should wait for new data to arrive if it +reaches the end of the capped collection and the batch is still empty. +If no new data arrives within that time limit, the server will respond +with an empty batch. For these cursors, drivers support both the +`timeoutMS` and `maxAwaitTimeMS` options. The `timeoutMS` option is used +to derive client-side timeouts, while the `maxAwaitTimeMS` option is +used as the `maxTimeMS` field for `getMore` commands. These values have +distinct meanings, so supporting both yields a more robust, albeit +verbose, API. Drivers error if `maxAwaitTimeMS` is greater than or equal +to `timeoutMS` because in that case, `getMore` requests would not +succeed if the batch was empty: the server would wait for +`maxAwaitTimeMS`, but the driver would close the socket after +`timeoutMS`. + +### Change stream behavior + +Change streams internally behave as tailable awaitData cursors, so the +behavior of the `timeoutMS` option is the same for both. The main +difference is that change streams are resumable and drivers +automatically perform resume attempts when they encounter transient +errors. This allows change streams to be resilient to timeouts. If +`timeoutMS` expires during a next call, drivers can’t auto-resume, but +they can make sure the change stream is not invalidated so the user can +call next again. In this case, the subsequent call would perform the +resume without doing a `getMore` first. + +### withTransaction communicates timeoutMS via ClientSession + +Because the `withTransaction` API doesn’t allow drivers to plumb down +the remaining timeout into the user-provided callback, this spec +requires the remaining timeout to be stored on the ClientSession. +Operations in the callback that run under that ClientSession can then +extract the timeout from the session and apply it. 
To avoid confusing +validation semantics, operations error if there is a timeout on the +session but also an overridden timeout for the operation. It’s possible +that the ability to communicate timeouts for a block of operations via a +ClientSession is useful as a general purpose API, but we’ve decided to +make it private until there are other known use cases. + +### withTransaction refreshes the timeout for abortTransaction + +If the user-provided callback to `withTransaction` times out, it could +leave a transaction running on the server. It’s imperative that drivers +make an effort to abort the open transaction because failing to do so +could result in the collections and databases affected by the +transaction being locked for a long period of time, which could cause +applications to stall. Because `timeoutMS` has expired before drivers +attempt to abort the transaction, we require drivers to refresh it and +apply the original value to the execution of the `abortTransaction` +operation. This can cause the entire `withTransaction` call to take up +to `2*timeoutMS`, but it was decided that this risk is worthwhile given +the importance of transaction cleanup. + +### GridFS streams behavior + +Streams created by GridFS API operations (e.g. by `open_upload_stream` +and `open_download_stream`) present a challenge for this specification. +These types of streams execute multiple operations, but there can be +artificial gaps between operations if the application does not invoke +the stream functions for long periods of time. Generally, we expect +users to upload or download an entire file as quickly as possible, so we +decided to have `timeoutMS` cap the lifetime of the created stream. The +other option was to apply the entire `timeoutMS` value to each operation +executed by the stream, but streams perform many hidden operations, so +this approach could cause an upload/download to take much longer than +expected. 
+ +### timeoutMS cannot be overridden for startSession calls + +In general, users can override `timeoutMS` at the level of a single +operation. The `startSession` operation, however, only inherits +`timeoutMS` from the MongoClient and does not allow the option to be +overridden. This was a conscious API design decision because drivers are +moving towards only supporting MongoDB versions 3.6 and higher, so +sessions will always be supported. Adding an override for `startSession` +would introduce a new knob and increase the API surface of drivers +without providing a significant benefit. + +### Drivers use minimum RTT to short circuit operations + +A previous version of this spec used the 90th percentile RTT to short +circuit operations that might otherwise fail with a socket timeout. We +decided to change this logic to avoid canceling operations that may have +a high chance of succeeding and also remove a dependency on t-digest. +Instead, drivers use the minimum RTT from the last 10 samples, or 0 +until at least 2 samples have been recorded. + +## Future work + +### Modify GridFS streams behavior via new options + +As explained in the design rationale, drivers use `timeoutMS` to cap the +entire lifetime of streams created by GridFS operations. If we find that +users are often encountering timeout errors when using these APIs due to +the time spent during non-MongoDB operations (e.g. streaming data read +from a GridFS stream into another data store), we could consider +toggling GridFS behavior via an option similar to `timeoutMode` for +cursors. To avoid backwards-breaking behavioral changes, the default +would continue to cap the stream lifetime but there could be another +mode that refreshes the timeout for each database operation. This would +mimic using `timeoutMode=ITERATION` for cursors. + +## Changelog + +- 2022-10-05: Remove spec front matter. +- 2022-01-19: Initial version. 
+- 2022-11-17: Use minimum RTT for maxTimeMS calculation instead of 90th + percentile RTT. diff --git a/source/client-side-operations-timeout/client-side-operations-timeout.rst b/source/client-side-operations-timeout/client-side-operations-timeout.rst deleted file mode 100644 index d75f421ead..0000000000 --- a/source/client-side-operations-timeout/client-side-operations-timeout.rst +++ /dev/null @@ -1,935 +0,0 @@ -============================== -Client Side Operations Timeout -============================== - -:Status: Accepted -:Minimum Server Version: 2.6 - -.. contents:: - --------- - -Abstract -======== - -This specification outlines a new ``timeoutMS`` option to govern the amount -of time that a single operation can execute before control is returned to the -user. This timeout applies to all of the work done to execute the operation, -including but not limited to server selection, connection checkout, and -server-side execution. - -META -==== - -The keywords “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, -“SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this -document are to be interpreted as described in `RFC 2119 -`_. - -Specification -============= - -Terms ------ - -min(a, b) - Shorthand for "the minimum of a and b" where ``a`` and ``b`` are numeric - values. For any cases where 0 means "infinite" (e.g. `timeoutMS`_), - ``min(0, other)`` MUST evaluate to ``other``. - -MongoClient Configuration -------------------------- - -This specification introduces a new configuration option and deprecates some -existing options. - -timeoutMS -~~~~~~~~~ - -This 64-bit integer option specifies the per-operation timeout value in -milliseconds. The default value is unset which means this feature is not -enabled, i.e. the existing timeout behavior is unchanged (including -``serverSelectionTimeoutMS``, ``connectTimeoutMS``, ``socketTimeoutMS`` etc..). 
-An explicit value of 0 means infinite, though some client-side timeouts like -``serverSelectionTimeoutMS`` will still apply. Drivers MUST error if a -negative value is specified. This value MUST be configurable at the level of -a MongoClient, MongoDatabase, MongoCollection, or of a single operation. -However, if the option is specified at any level, it cannot be later changed -to unset. At each level, the value MUST be inherited from the previous level -if it is not explicitly specified. Additionally, some entities like -``ClientSession`` and ``GridFSBucket`` either inherit ``timeoutMS`` from -their parent entities or provide options to override it. The behavior for -these entities is described in individual sections of this specification. - -Drivers for languages that provide an idiomatic API for expressing durations -of time (e.g. ``TimeSpan`` in .NET) MAY choose to leverage these APIs for the -``timeoutMS`` option rather than using int64. Drivers that choose to do so -MUST also follow the semantics for special values defined by those types. -Such drivers MUST also ensure that there is a way to explicitly set -``timeoutMS`` to ``infinite`` in the API. - -See `timeoutMS cannot be changed to unset once it’s specified`_. - -Backwards Breaking Considerations -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -This specification deprecates many existing timeout options and introduces a -new exception type that is used to communicate timeout expiration. If drivers -need to make backwards-breaking changes to support ``timeoutMS``, the -backwards breaking behavior MUST be gated behind the presence of the -``timeoutMS`` option. If the ``timeoutMS`` option is not set, drivers MUST -continue to honor existing timeouts such as ``socketTimeoutMS``. Backwards -breaking changes include any changes to exception types thrown by stable API -methods or changes to timeout behavior. Drivers MUST document these changes. 
- -In a subsequent major release, drivers SHOULD drop support for legacy timeout -behavior and only continue to support the timeout options that are not -deprecated by this specification. Once legacy options are removed, drivers -MUST make the backwards-breaking behavioral changes described in this -specification regardless of whether or not ``timeoutMS`` is set by the -application. - -See the `Errors`_ section for explanations of the backwards-breaking changes -to error reporting. - -Deprecations -~~~~~~~~~~~~ - -The following configuration timeout options MUST be deprecated in favor of -``timeoutMS``: - -- ``socketTimeoutMS`` - -- ``waitQueueTimeoutMS`` - -- ``wTimeoutMS`` - -The following options for CRUD methods MUST be deprecated in favor of -``timeoutMS``: - -- ``maxTimeMS`` - -- ``maxCommitTimeMS`` - -Timeout Behavior ----------------- - -The ``timeoutMS`` option specifies the best-effort maximum amount of time a -single operation can take before control is returned to the application. -Drivers MUST keep track of the remaining time before the timeout expires as -an operation progresses. - -Operations -~~~~~~~~~~ - -The ``timeoutMS`` option applies to all operations defined in the following -specifications: - -- `CRUD <./../crud/crud.rst>`__ -- `Change Streams <../change-streams/change-streams.rst>`__ -- `Client Side Encryption <../client-side-encryption/client-side-encryption.rst>`__ -- `Enumerating Collections <../enumerate-collections.rst>`__ -- `Enumerating Databases <../enumerate-databases.rst>`__ -- `GridFS <../gridfs/gridfs-spec.rst>`__ -- `Index Management <../index-management/index-management.rst>`__ -- `Transactions <../transactions/transactions.rst>`__ -- `Convenient API for Transactions <../transactions-convenient-api/transactions-convenient-api.rst>`__ - -In addition, it applies to all operations on cursor objects that may perform -blocking work (e.g. 
methods to iterate or close a cursor, any method that -reads documents from a cursor into an array, etc). - -Validation and Overrides -~~~~~~~~~~~~~~~~~~~~~~~~ - -When executing an operation, drivers MUST ignore any deprecated timeout -options if ``timeoutMS`` is set on the operation or is inherited from the -collection/database/client levels. In addition to being set at these levels, -the timeout for an operation can also be expressed via an explicit -ClientSession (see `Convenient Transactions API`_). In this case, the timeout -on the session MUST be used as the ``timeoutMS`` value for the operation. -Drivers MUST raise a validation error if an explicit session with a timeout -is used and the ``timeoutMS`` option is set at the operation level for -operations executed as part of a ``withTransaction`` callback. - -See `timeoutMS overrides deprecated timeout options`_. - -Errors -~~~~~~ - -If the ``timeoutMS`` option is not set and support for deprecated timeout -options has not been dropped but a timeout is encountered (e.g. server -selection times out), drivers MUST continue to return existing errors. This -ensures that error-handling code in existing applications does not break -unless the user opts into using ``timeoutMS``. - -If the ``timeoutMS`` option is set and the timeout expires, drivers MUST -abort all blocking work and return control to the user with an error. This -error MUST be distinguished in some way (e.g. custom exception type) to make -it easier for users to detect when an operation fails due to a timeout. If -the timeout expires during a blocking task, drivers MUST expose the -underlying error returned from the task from this new error type. The -stringified version of the new error type MUST include the stringified -version of the underlying error as a substring. For example, if server -selection expires and returns a ``ServerSelectionTimeoutException``, drivers -must allow users to access that exception from this new error type. 
If there -is no underlying error, drivers MUST add information about when the timeout -expiration was detected to the stringified version of the timeout error. - -Error Transformations -````````````````````` - -When using the new timeout error type, drivers MUST transform timeout errors -from external sources into the new error. One such error is the -``MaxTimeMSExpired`` server error. When checking for this error, drivers MUST -only check that the error code is 50 and MUST NOT check the code name or -error message. This error can be present in a top-level response document -where the ``ok`` value is 0, as part of an error in the ``writeErrors`` -array, or in a nested ``writeConcernError`` document. For example, all three -of the following server responses would match this criteria: - -.. code:: javascript - - {ok: 0, code: 50, codeName: "MaxTimeMSExpired", errmsg: "operation time limit exceeded"} - - {ok: 1, writeErrors: [{code: 50, codeName: "MaxTimeMSExpired", errmsg: "operation time limit exceeded"}]} - - {ok: 1, writeConcernError: {code: 50, codeName: "MaxTimeMSExpired"}} - -Timeouts from other sources besides MongoDB servers MUST also be transformed -into this new exception type. These include socket read/write timeouts and -HTTP request timeouts. - -Blocking Sections for Operation Execution -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The following pieces of operation execution are considered blocking: - -#. Implicit session acquisition if an explicit session was not provided for the - operation. This is only considered blocking for drivers that perform server - selection to determine session support when acquiring implicit sessions. -#. Server selection -#. Connection checkout - If ``maxPoolSize`` has already been reached for the - selected server, this is the amount of time spent waiting for a connection to - be available. -#. 
Connection establishment - If the pool for the selected server is - empty and a new connection is needed, the following pieces of connection - establishment are considered blocking: - - #. TCP socket establishment - - #. TLS handshake - - #. All messages sent over the socket as part of the TLS handshake - - #. OCSP verification - HTTP requests sent to OCSP responders. - - #. MongoDB handshake (i.e. initial connection ``hello``) - - #. Authentication - - #. SCRAM-SHA-1, SCRAM-SHA-256, PLAIN: Execution of the command required - for the SASL conversation. - - #. GSSAPI: Execution of the commands required for the SASL conversation - and requests to the KDC and TGS. - - #. MONGODB-AWS: Execution of the commands required for the SASL - conversation and all HTTP requests to ECS and EC2 endpoints. - - #. MONGODB-X509: Execution of the commands required for the - authentication conversation. - -#. Client-side encryption - - #. Execution of ``listCollections`` commands to get collection schemas. - - #. Execution of ``find`` commands against the key vault collection to get - encrypted data keys. - - #. Requests to non-local key management servers (e.g. AWS KMS) to decrypt - data keys. - - #. Requests to mongocryptd servers. - -#. Socket write to send a command to the server - -#. Socket read to receive the server’s response - -The ``timeoutMS`` option MUST apply to all blocking sections. Drivers MUST -document any exceptions. For example, drivers that do not have full control -over OCSP verification might not be able to set timeouts for HTTP requests to -responders and would document that OCSP verification could result in an -execution time greater than ``timeoutMS``. - -Server Selection -~~~~~~~~~~~~~~~~ - -If ``timeoutMS`` is set, drivers MUST use ``min(serverSelectionTimeoutMS, -remaining timeoutMS)``, referred to as ``computedServerSelectionTimeout`` as -the timeout for server selection and connection checkout. 
The server selection -loop MUST fail with a timeout error once the timeout expires. - -After a server has been selected, drivers MUST use the remaining -``computedServerSelectionTimeout`` value as the timeout for connection -checkout. If a new connection is required, ``min(connectTimeoutMS, remaining -computedServerSelectionTimeout)`` MUST be used as the timeout for TCP socket -establishment. Any network requests required to create or authenticate a -connection (e.g. HTTP requests to OCSP responders) MUST use -``min(operationTimeout, remaining computedServerSelectionTimeout)`` as a -timeout, where ``operationTimeout`` is the specified default timeout for the -network request. If there is no specified default, these operations MUST use -the remaining ``computedServerSelectionTimeout`` value. All commands sent -during the connection’s handshake MUST use the remaining -``computedServerSelectionTimeout`` as their ``timeoutMS`` value. Handshake -commands MUST also set timeouts per the `Command Execution`_ section. - -If ``timeoutMS`` is not set and support for ``waitQueueTimeoutMS`` has not -been removed, drivers MUST continue to exhibit the existing timeout behavior -by honoring ``serverSelectionTimeoutMS`` for server selection and -``waitQueueTimeoutMS`` for connection checkout. If a new connection is -required, drivers MUST use ``connectTimeoutMS`` as the timeout for socket -establishment and ``socketTimeoutMS`` as the socket timeout for all handshake -commands. - -See `serverSelectionTimeoutMS is not deprecated`_ and `connectTimeoutMS is -not deprecated`_. - -Command Execution -~~~~~~~~~~~~~~~~~ - -If ``timeoutMS`` is set, drivers MUST append a ``maxTimeMS`` field to -commands executed against a MongoDB server using the ``minRoundTripTime`` field of -the selected server. 
Note that this value MUST be retrieved during server -selection using the ``servers`` field of the same `TopologyDescription -<../server-discovery-and-monitoring/server-discovery-and-monitoring.rst#TopologyDescription>`__ -that was used for selection before the selected server's description can be -modified. Otherwise, drivers may be subject to a race condition where a -server is reset to the default description (e.g. due to an error in the -monitoring thread) after it has been selected but before the RTT is -retrieved. - -If the ``minRoundTripTime`` is less than the remaining timeoutMS, -the value of this field MUST be ``remaining timeoutMS - minRoundTripTime``. -If not, drivers MUST return a timeout error without -attempting to send the message to the server. This is done to ensure that an -operation is not routed to the server if it will likely fail with a socket -timeout as that could cause connection churn. The ``maxTimeMS`` field MUST be -appended after all blocking work is complete. - -After wire message construction, drivers MUST check for timeout before -writing the message to the server. If the timeout has expired or the amount -of time remaining is less than the selected server's minimum RTT, -drivers MUST return the connection to the pool and raise a timeout exception. -Otherwise, drivers MUST set the connection’s write timeout to the remaining -``timeoutMS`` value before writing a message to the server. After the write -is complete, drivers MUST check for timeout expiration before reading the -server’s response. If the timeout has expired, the connection MUST be closed -and a timeout exception MUST be propagated to the application. If it has not, -drivers MUST set the connection’s read timeout to the remaining ``timeoutMS`` -value. The timeout MUST apply to the aggregate of all reads done to receive a -server response, not to individual reads. 
If any read or write calls on the -socket fail with a timeout, drivers MUST transform the error into the new -timeout exception as described in the `Error Transformations`_ section. - -If ``timeoutMS`` is not set and support for ``socketTimeoutMS`` has not been -removed, drivers MUST honor ``socketTimeoutMS`` as the timeout for socket -reads and writes. - -See `maxTimeMS accounts for server RTT`_. - -Batching -~~~~~~~~ - -If an operation must be sent to the server in multiple batches (e.g. -``collection.bulkWrite()``), the ``timeoutMS`` option MUST apply to the -entire operation, not to each individual batch. - -Retryability -~~~~~~~~~~~~ - -If an operation requires a retry per the retryable reads or writes -specifications and ``timeoutMS`` is set, drivers MUST -retry operations as many times as possible before the timeout expires or a -retry attempt returns a non-retryable error. Once the timeout expires, a -timeout error MUST be raised. - -See `Why don’t drivers use backoff/jitter between retry attempts?`_. - -Client Side Encryption -~~~~~~~~~~~~~~~~~~~~~~ - -If automatic client-side encryption or decryption is enabled, the remaining -``timeoutMS`` value MUST be used as the ``timeoutMS`` when executing -``listCollections`` commands to retrieve collection schemas, ``find`` -commands to get data from the key vault, and any commands against -mongocryptd. It MUST also be used as the request timeout for HTTP requests -against KMS servers to decrypt data keys. When sending a command to -mongocryptd, drivers MUST NOT append a ``maxTimeMS`` field. This is to ensure -that a ``maxTimeMS`` field can be safely appended to the command after it has -been marked by mongocryptd and encrypted by libmongocrypt. To determine -whether or not the server is a mongocryptd, drivers MUST check that the -``iscryptd`` field in the server's description is ``true``. 
- -For explicit encryption and decryption, the ``ClientEncryptionOpts`` options -type used to construct `ClientEncryption -<../client-side-encryption/client-side-encryption.rst#clientencryption>`_ -instances MUST support a new ``timeoutMS`` option, which specifies the timeout -for all operations executed on the ``ClientEncryption`` object. - -See `maxTimeMS is not added for mongocryptd`_. - -Background Connection Pooling ------------------------------ - -Connections created as part of a connection pool’s ``minPoolSize`` -maintenance routine MUST use ``connectTimeoutMS`` as the timeout for -connection establishment. After the connection is established, if -``timeoutMS`` is set at the MongoClient level, it MUST be used as the timeout -for all commands sent as part of the MongoDB or authentication handshakes. -The timeout MUST be refreshed after each command. These commands MUST set -timeouts per the `Command Execution`_ section. If ``timeoutMS`` is not set, -drivers MUST continue to honor ``socketTimeoutMS`` as the socket timeout for -handshake and authentication commands. - -Server Monitoring ------------------ - -Drivers MUST NOT use ``timeoutMS`` for commands executed by the server -monitoring and RTT calculation threads. - -See `Monitoring threads do not use timeoutMS`_. - -Cursors -------- - -For operations that create cursors, ``timeoutMS`` can either cap the lifetime -of the cursor or be applied separately to the original operation and all -``next`` calls. To support both of these use cases, these operations MUST -support a ``timeoutMode`` option. This option is an enum with possible values -``CURSOR_LIFETIME`` and ``ITERATION``. The default value depends on the type -of cursor being created. Drivers MUST error if ``timeoutMode`` is set and -``timeoutMS`` is not. - -When applying the ``timeoutMS`` option to ``next`` calls on cursors, drivers -MUST ensure it applies to the entire call, not individual commands. 
For -drivers that send ``getMore`` requests in a loop when iterating tailable -cursors, the timeout MUST apply to the totality of all ``getMore``’s, not to -each one individually. If a resume is required for a ``next`` call on a -change stream, the timeout MUST apply to the entirety of the initial -``getMore`` and all commands sent as part of the resume attempt. - -For ``close`` methods, drivers MUST allow ``timeoutMS`` to be overridden if -doing so is possible in the language. If explicitly set for the operation, -it MUST be honored. Otherwise, if ``timeoutMS`` was applied to the operation -that created the cursor, it MUST be refreshed for the ``killCursors`` command -if one is required. Note that this means ``timeoutMS`` will be refreshed for -the ``close`` call even if the cursor was created with a ``timeoutMode`` of -``CURSOR_LIFETIME`` and the timeout associated with the cursor has expired. -The calculated timeout MUST apply to explicit ``close`` methods that can be -invoked by users as well as implicit destructors that are automatically -invoked when exiting resource blocks. - -See `Cursor close() methods refresh timeoutMS`_. - -Non-tailable Cursors -~~~~~~~~~~~~~~~~~~~~ - -For non-tailable cursors, the default value of ``timeoutMode`` is -``CURSOR_LIFETIME``. If ``timeoutMS`` is set, drivers MUST apply it to the -original operation and the lifetime of the created cursor. For example, if a -``find`` is executed at time ``T``, the ``find`` and all ``getMore``’s on the -cursor must finish by time ``T + timeoutMS``. When executing ``next`` calls -on the cursor, drivers MUST use the remaining timeout as the ``timeoutMS`` -value for the operation but MUST NOT append a ``maxTimeMS`` field to -``getMore`` commands. If there are documents remaining in a previously -retrieved batch, the ``next`` method MUST return them even if the timeout has -expired and MUST only return a timeout error if a ``getMore`` is required. 
- -If ``timeoutMode`` is set to ``ITERATION``, drivers MUST raise a client-side -error if the operation is an ``aggregate`` with a ``$out`` or ``$merge`` -pipeline stage. If the operation is not an ``aggregate`` with ``$out`` or -``$merge``, drivers MUST honor the ``timeoutMS`` option for the initial -command but MUST NOT append a ``maxTimeMS`` field to the command sent to the -server. After the operation has executed, the original ``timeoutMS`` value -MUST also be applied to each ``next`` call on the created cursor. Drivers -MUST NOT append a ``maxTimeMS`` field to ``getMore`` commands. - -See `Non-tailable cursor behavior`_. - -Tailable Cursors -~~~~~~~~~~~~~~~~ - -Tailable cursors only support the ``ITERATION`` value for the ``timeoutMode`` -option. This is the default value and drivers MUST error if the option is set -to ``CURSOR_LIFETIME``. - -Tailable non-awaitData Cursors -`````````````````````````````` - -If ``timeoutMS`` is set, drivers MUST apply it separately to the original -operation and to all ``next`` calls on the resulting cursor but MUST NOT -append a ``maxTimeMS`` field to any commands. - -Tailable awaitData Cursors -`````````````````````````` - -If ``timeoutMS`` is set, drivers MUST apply it to the original operation. -Drivers MUST also apply the original ``timeoutMS`` value to each ``next`` -call on the resulting cursor but MUST NOT use it to derive a ``maxTimeMS`` -value for ``getMore`` commands. Helpers for operations that create tailable -awaitData cursors MUST also support the ``maxAwaitTimeMS`` option. Drivers -MUST error if this option is set, ``timeoutMS`` is set to a non-zero value, -and ``maxAwaitTimeMS`` is greater than or equal to ``timeoutMS``. If this -option is set, drivers MUST use it as the ``maxTimeMS`` field on ``getMore`` -commands. - -See `Tailable cursor behavior`_ for rationale regarding both non-awaitData -and awaitData cursors. 
- -Change Streams -~~~~~~~~~~~~~~ - -Driver ``watch`` helpers MUST support both ``timeoutMS`` and -``maxAwaitTimeMS`` options. Drivers MUST error if ``maxAwaitTimeMS`` is set, -``timeoutMS`` is set to a non-zero value, and ``maxAwaitTimeMS`` is greater -than or equal to ``timeoutMS``. These helpers MUST NOT support the -``timeoutMode`` option as change streams are an abstraction around -tailable-awaitData cursors, so they implicitly use ``ITERATION`` mode. If -set, drivers MUST apply the ``timeoutMS`` option to the initial ``aggregate`` -operation. Drivers MUST also apply the original ``timeoutMS`` value to each -``next`` call on the change stream but MUST NOT use it to derive a -``maxTimeMS`` field for ``getMore`` commands. If the ``maxAwaitTimeMS`` -option is set, drivers MUST use it as the ``maxTimeMS`` field on ``getMore`` -commands. - -If a ``next`` call fails with a timeout error, drivers MUST NOT invalidate -the change stream. The subsequent ``next`` call MUST perform a resume attempt -to establish a new change stream on the server. Any errors from the -``aggregate`` operation done to create a new change stream MUST be propagated -to the application. Drivers MUST document that users can either call ``next`` -again or close the existing change stream and create a new one if a previous -``next`` call times out. The documentation MUST suggest closing and -re-creating the stream with a higher timeout if the timeout occurs before any -events have been received because this is a signal that the server is timing -out before it can finish processing the existing oplog. - -See `Change stream behavior`_. - -Sessions --------- - -The `SessionOptions <../sessions/driver-sessions.rst#mongoclient-changes>`_ -used to construct explicit `ClientSession -<../sessions/driver-sessions.rst#clientsession>`_ instances MUST accept a new -``defaultTimeoutMS`` option, which specifies the ``timeoutMS`` value for the -following operations executed on the session: - -#. 
commitTransaction -#. abortTransaction -#. withTransaction -#. endSession - -If this option is not specified for a ``ClientSession``, it MUST inherit the -``timeoutMS`` of its parent MongoClient. - -Session checkout -~~~~~~~~~~~~~~~~ - -As noted in `Blocking Sections for Operation Execution`_, implicit session -checkout can be considered a blocking process for some drivers. Such drivers -MUST apply the remaining ``timeoutMS`` value to this process when executing -an operation. For explicit session checkout, drivers MUST apply the -``timeoutMS`` value of the MongoClient to the ``startSession`` call if set. -Drivers MUST NOT allow users to override ``timeoutMS`` for ``startSession`` -operations. - -See `timeoutMS cannot be overridden for startSession calls`_. - -Convenient Transactions API -~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -If ``timeoutMS`` is set, drivers MUST apply it to the entire -``withTransaction`` call. To propagate the timeout to the user-supplied -callback, drivers MUST store the timeout as a field on the ClientSession -object. This field SHOULD be private to ensure that a user can not modify it -while a ``withTransaction`` call is in progress. Drivers that cannot make -this field private MUST signal that the field should not be accessed or -modified by users if there is an idiomatic way to do so in the language (e.g. -underscore-prefixed variable names in Python) and MUST document that -modification of the field can cause unintended correctness issues for -applications. Drivers MUST document that the remaining timeout will not be -applied to callback operations that do not use the ClientSession. Drivers -MUST also document that overridding ``timeoutMS`` for operations executed -using the explict session inside the provided callback will result in a -client-side error, as defined in `Validation and Overrides`_. 
If the callback -returns an error and the transaction must be aborted, drivers MUST refresh -the ``timeoutMS`` value for the ``abortTransaction`` operation. - -If ``timeoutMS`` is not set, drivers MUST continue to exhibit the existing -120 second timeout behavior. Drivers MUST NOT change existing implementations -to use ``timeoutMS=120000`` for this case. - -See `withTransaction communicates timeoutMS via ClientSession`_ and -`withTransaction refreshes the timeout for abortTransaction`_. - -GridFS API ----------- - -GridFS buckets MUST inherit ``timeoutMS`` from their parent MongoDatabase -instance and all methods in the GridFS Bucket API MUST support the -``timeoutMS`` option. For methods that create streams (e.g. -``open_upload_stream``), the option MUST cap the lifetime of the entire -stream. This MUST include the time taken by any operations executed during -stream construction, reads/writes, and close/abort calls. For example, if a -stream is created at time ``T``, the final ``close`` call on the stream MUST -finish all blocking work before time ``T + timeoutMS``. Methods that interact -with a user-provided stream (e.g. ``upload_from_stream``) MUST use -``timeoutMS`` as the timeout for the entire upload/download operation. If the -user-provided streams do not support timeouts, drivers MUST document that the -timeout for these methods may be breached if calls to interact with the -stream take longer than the remaining timeout. If ``timeoutMS`` is set, all -cursors created for GridFS API operations MUST internally set the -``timeoutMode`` option to ``CURSOR_LIFETIME``. - -See `GridFS streams behavior`_. - -RunCommand ----------- - -The behavior of ``runCommand`` is undefined if the provided command document -includes a ``maxTimeMS`` field and the ``timeoutMS`` option is set. Drivers -MUST document the behavior of ``runCommand`` for this case and MUST NOT -attempt to check the command document for the presence of a ``maxTimeMS`` -field. 
- -See `runCommand behavior`_. - -Test Plan -========= - -See the `README.rst -`__ -in the tests directory. - -Motivation for Change -===================== - -Users have many options to set timeouts for various parts of operation -execution including, but not limited to, ``serverSelectionTimeoutMS``, -``socketTimeoutMS``, ``connectTimeoutMS``, ``maxTimeMS``, and ``wTimeoutMS``. -As a result, users are often unsure which timeout to use. Because some of -these timeouts are additive, it is difficult to set a combination which -ensures control will be returned to the user after a specified amount of -time. To make timeouts more intuitive, changes are required to the drivers -API to deprecate some of the existing timeouts and add a new one to specify -the maximum execution time for an entire operation from start to finish. - -In addition, automatically retrying reads and writes that failed due to -transient network blips or planned maintenance scenarios has improved -application resiliency but the original behavior of only retrying once still -allowed some errors to be propagated to applications. Supporting a timeout -for an entire operation allows drivers to retry operations multiple times -while still guaranteeing that an application can get back control once the -specified amount of time has elapsed. - -Design Rationale -================ - -timeoutMS cannot be changed to unset once it’s specified --------------------------------------------------------- - -If ``timeoutMS`` is specified at any level, it cannot be later changed to -unset at a lower level. For example, a user cannot do: - -.. code:: python - - client = MongoClient(uri, timeoutMS=1000) - db = client.database("foo", timeoutMS=None) - -This is because drivers return existing exception types if ``timeoutMS`` is -not specified, but will return new exception types and use new timeout -behaviors if it is. Once the user has opted into this behavior, we should not -allow them to opt out of it at a lower level. 
If a user wishes to set the -timeout to infinite for a specific database, collection, or operation, they -can explicitly set ``timeoutMS`` to 0. - -serverSelectionTimeoutMS is not deprecated ------------------------------------------- - -The original goal of the project was to expose a single timeout and deprecate -all others. This was not possible, however, because executing an operation -consists of two distinct parts. The first is selecting a server and checking -out a connection from its pool. This should have a default timeout because -failure to do this indicates that the deployment is not in a healthy state or -that there was a configuration error which prevents the driver from -successfully connecting. The second is server-side operation execution, which -cannot have a default timeout. Some operations finish in a few milliseconds, -while others can run for many hours. Adding a default would inevitably break -applications. To accomplish both of these goals, ``serverSelectionTimeoutMS`` -was preserved and is used to timeout the client-side section of operation -execution. - -connectTimeoutMS is not deprecated ----------------------------------- - -Similar to the reasoning for not deprecating ``serverSelectionTimeoutMS``, -socket establishment should have a default timeout because failure to create -a socket likely means that the target server is not healthy or there is a -network issue. To accomplish this, the ``connectTimeoutMS`` option is not -deprecated by this specification. Drivers also use ``connectTimeoutMS`` to -derive a socket timeout for monitoring connections, which are not subject to -timeoutMS. - -timeoutMS overrides deprecated timeout options ----------------------------------------------- - -Applying both ``timeoutMS`` and a deprecated timeout option like -``socketTimeoutMS`` at the same time would lead to confusing semantics that -are difficult to document and understand. 
When first writing this -specification, we considered having drivers error in this situation to catch -mismatched timeouts as early as possible. However, because ``timeoutMS`` can -be set at any level, this behavior could lead to unanticipated runtime errors -if an application set ``timeoutMS`` for a specific operation and the -MongoClient used in production was configured with a deprecated timeout -option. To have clear semantics and avoid unexpected errors in applications, we -decided that ``timeoutMS`` should override deprecated timeout options. - -maxTimeMS is not added for mongocryptd --------------------------------------- - -The mongocryptd server annotates the provided command to indicate encryption -requirements and returns the marked up result. If the command sent to -mongocryptd contained ``maxTimeMS``, the final command sent to MongoDB would -contain two ``maxTimeMS`` fields: one added by the regular MongoClient and -another added by the mongocryptd client. To avoid this complication, drivers -do not add this field when sending commands to mongocryptd at all. Doing so -does not sacrifice any functionality because mongocryptd always runs on -localhost and does not perform any blocking work, so execution or network -timeouts cannot occur. - -maxTimeMS accounts for server RTT ---------------------------------- - -When constructing a command, drivers use the ``timeoutMS`` option to derive a -value for the ``maxTimeMS`` command option and the socket timeout. The full -time to round trip a command is (network RTT + server-side execution time). -If both ``maxTimeMS`` and socket timeout were set to the same value, the -server would never be able to respond with a ``MaxTimeMSExpired`` error -because drivers would hit the socket timeout first and close the connection. -This would lead to connection churn if the specified timeout is too low. 
To -allow the server to gracefully error and avoid churn, drivers must account -for the network round trip in the ``maxTimeMS`` calculation. - -Monitoring threads do not use timeoutMS ---------------------------------------- - -Using ``timeoutMS`` in the monitoring and RTT calculation threads would -require another special case in the code that derives ``maxTimeMS`` from -``timeoutMS`` because the awaitable ``hello`` requests sent to 4.4+ -servers already have a ``maxAwaitTimeMS`` field. Adding ``maxTimeMS`` also -does not help for non-awaitable ``hello`` commands because we expect them -to execute quickly on the server. The Server Monitoring spec already mandates -that drivers set and dynamically update the read/write timeout of the -dedicated connections used in monitoring threads, so we rely on that to time -out commands rather than adding complexity to the behavior of ``timeoutMS``. - -runCommand behavior -------------------- - -The behavior of runCommand varies across drivers. If the provided command -document includes a ``maxTimeMS`` field and the ``timeoutMS`` option is set, -some drivers would overwrite the ``maxTimeMS`` field with the value derived -from ``timeoutMS``, while others would append a second ``maxTimeMS`` field, -which would cause a server error on versions 3.4+. To be prescriptive, we -could mandate that drivers raise a client-side error in this case, but this -would require a potentially expensive lookup in the command document. To -avoid this additional cost, drivers are only required to document the -behavior and suggest that ``timeoutMS`` be used instead of including a manual -``maxTimeMS`` field. - -Why don’t drivers use backoff/jitter between retry attempts? ------------------------------------------------------------- - -Earlier versions of this specification proposed adding backoff and/or jitter -between retry attempts to avoid connection storming or overloading the -server, but we later deemed this unnecessary. 
If multiple concurrent -operations select the same server for a retry and its connection pool is -empty, we rely on the ``maxConnecting`` parameter introduced in DRIVERS-781 -to rate limit new connection attempts, which mitigates the risk of connection -storms. Even if the new server has enough connections in its pool to service -the operations, recent server versions do very little resource-intensive work -until execution reaches the storage layer, which is already guarded by -read/write tickets, so we don’t expect the server to be overwhelmed. If we -later decide that adding jitter would be useful, it may be easier to do so in -the server itself via a ticket-based admission system earlier in the -execution stack. - -Cursor close() methods refresh timeoutMS ----------------------------------------- - -If a cursor times out client-side (e.g. a non-tailable cursor created with -``timeoutMode=CURSOR_LIFETIME``), it’s imperative that drivers make a -good-faith effort to close the server-side cursor even though the timeout has -expired because failing to do so would leave resources open on the server for -a potentially long time. It was decided that ``timeoutMS`` will be refreshed -for ``close`` operations to allow the cursor to be killed server-side. - -Non-tailable cursor behavior ----------------------------- - -There are two usage patterns for non-tailable cursors. The first is to read -documents from a cursor into an iterable object, either by explicitly -iterating the cursor in a loop or using a language construct like Python list -comprehensions. To supply a timeout for the entire process, drivers use -``timeoutMS`` to cap the execution time for the initial command and all -required ``getMore``’s. This use case also matches the server behavior; if -``maxTimeMS`` is set for an operation that creates a non-tailable cursor, the -server will use the time limit to cap the total server-side execution time -for future ``getMore``’s. 
Because this type of usage matches the server -behavior and is the more common case, this is the default behavior. - -The second use case is batch processing, where the user takes advantage of -the lazy nature of cursors to process documents from a large collection. In -this case, the user does not want all documents from the collection to be in -an array because that would require too much memory. To accommodate this use -case, drivers support a new ``timeoutMode`` option. Users can set the value -for this option to ``ITERATION`` to have ``timeoutMS`` apply to the original -command and then separately to each ``next`` call. When this option is used, -drivers do not set ``maxTimeMS`` on the initial command to avoid capping the -cursor lifetime in the server. - -Tailable cursor behavior ------------------------- - -Once a tailable cursor is created, it conceptually lives forever. Therefore, -it only makes sense to support ``timeoutMode=ITERATION`` for these cursors -and drivers error if ``timeoutMode=CURSOR_LIFETIME`` is specified. - -There are two types of tailable cursors. The first, tailable non-awaitData -cursors, support ``maxTimeMS`` for the original command but not for any -``getMore`` requests. However, setting ``maxTimeMS`` on the original command -also incorrectly caps the server-side execution time for future ``getMore``’s -(SERVER-51153). This is -undesirable behavior because it does not match the guarantees made by -``timeoutMode=ITERATION``. To work around this, drivers honor ``timeoutMS`` -for both the original operation and all ``getMore``’s but only use it to -derive client-side timeouts and do not append a ``maxTimeMS`` field to any -commands. The server-side execution time is enforced via socket timeouts. - -The second type is tailable awaitData cursors. The server supports the -``maxTimeMS`` option for the original command.
For ``getMore``’s, the option -is supported, but instead of limiting the server-side execution time, it -specifies how long the server should wait for new data to arrive if it -reaches the end of the capped collection and the batch is still empty. If no -new data arrives within that time limit, the server will respond with an -empty batch. For these cursors, drivers support both the ``timeoutMS`` and -``maxAwaitTimeMS`` options. The ``timeoutMS`` option is used to derive -client-side timeouts, while the ``maxAwaitTimeMS`` option is used as the -``maxTimeMS`` field for ``getMore`` commands. These values have distinct -meanings, so supporting both yields a more robust, albeit verbose, API. -Drivers error if ``maxAwaitTimeMS`` is greater than or equal to ``timeoutMS`` -because in that case, ``getMore`` requests would not succeed if the batch was -empty: the server would wait for ``maxAwaitTimeMS``, but the driver would -close the socket after ``timeoutMS``. - -Change stream behavior ----------------------- - -Change streams internally behave as tailable awaitData cursors, so the -behavior of the ``timeoutMS`` option is the same for both. The main -difference is that change streams are resumable and drivers automatically -perform resume attempts when they encounter transient errors. This allows -change streams to be resilient to timeouts. If ``timeoutMS`` expires during a -next call, drivers can’t auto-resume, but they can make sure the change -stream is not invalidated so the user can call next again. In this case, the -subsequent call would perform the resume without doing a ``getMore`` first. - -withTransaction communicates timeoutMS via ClientSession --------------------------------------------------------- - -Because the ``withTransaction`` API doesn’t allow drivers to plumb down the -remaining timeout into the user-provided callback, this spec requires the -remaining timeout to be stored on the ClientSession. 
Operations in the -callback that run under that ClientSession can then extract the timeout from -the session and apply it. To avoid confusing validation semantics, operations -error if there is a timeout on the session but also an overridden timeout for -the operation. It’s possible that the ability to communicate timeouts for a -block of operations via a ClientSession is useful as a general purpose API, -but we’ve decided to make it private until there are other known use cases. - -withTransaction refreshes the timeout for abortTransaction ----------------------------------------------------------- - -If the user-provided callback to ``withTransaction`` times out, it could -leave a transaction running on the server. It’s imperative that drivers make -an effort to abort the open transaction because failing to do so could result -in the collections and databases affected by the transaction being locked for -a long period of time, which could cause applications to stall. Because -``timeoutMS`` has expired before drivers attempt to abort the transaction, we -require drivers to refresh it and apply the original value to the execution -of the ``abortTransaction`` operation. This can cause the entire -``withTransaction`` call to take up to ``2*timeoutMS``, but it was decided -that this risk is worthwhile given the importance of transaction cleanup. - -GridFS streams behavior ------------------------ - -Streams created by GridFS API operations (e.g. by ``open_upload_stream`` and -``open_download_stream``) present a challenge for this specification. These -types of streams execute multiple operations, but there can be artificial -gaps between operations if the application does not invoke the stream -functions for long periods of time. Generally, we expect users to upload or -download an entire file as quickly as possible, so we decided to have -``timeoutMS`` cap the lifetime of the created stream. 
The other option was to -apply the entire ``timeoutMS`` value to each operation executed by the -stream, but streams perform many hidden operations, so this approach could -cause an upload/download to take much longer than expected. - -timeoutMS cannot be overridden for startSession calls ------------------------------------------------------ - -In general, users can override ``timeoutMS`` at the level of a single -operation. The ``startSession`` operation, however, only inherits -``timeoutMS`` from the MongoClient and does not allow the option to be -overridden. This was a conscious API design decision because drivers are -moving towards only supporting MongoDB versions 3.6 and higher, so sessions -will always be supported. Adding an override for ``startSession`` would -introduce a new knob and increase the API surface of drivers without providing -a significant benefit. - - -Drivers use minimum RTT to short circuit operations ---------------------------------------------------- - -A previous version of this spec used the 90th percentile RTT to short -circuit operations that might otherwise fail with a socket timeout. -We decided to change this logic to avoid canceling operations that may -have a high chance of succeeding and also remove a dependency on t-digest. -Instead, drivers use the minimum RTT from the last 10 samples, or 0 until -at least 2 samples have been recorded. - -Future work -=========== - -Modify GridFS streams behavior via new options ----------------------------------------------- - -As explained in the design rationale, drivers use ``timeoutMS`` to cap the -entire lifetime of streams created by GridFS operations. If we find that users -are often encountering timeout errors when using these APIs due to the time -spent during non-MongoDB operations (e.g. streaming data read from a GridFS -stream into another data store), we could consider toggling GridFS behavior -via an option similar to ``timeoutMode`` for cursors.
To avoid -backwards-breaking behavioral changes, the default would continue to cap the -stream lifetime but there could be another mode that refreshes the timeout -for each database operation. This would mimic using -``timeoutMode=ITERATION`` for cursors. - - -Changelog -========= - -:2022-10-05: Remove spec front matter. -:2022-01-19: Initial version. -:2022-11-17: Use minimum RTT for maxTimeMS calculation instead of 90th percentile RTT.