Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DRIVERS-3009, DRIVERS-1447: Updates to find operations #1691

Merged
merged 12 commits into from
Nov 13, 2024

Conversation

alcaeus
Copy link
Member

@alcaeus alcaeus commented Oct 29, 2024

Please complete the following before merging:

  • Update changelog.
  • Make sure there are generated JSON files from the YAML test files.
  • Test changes in at least one language driver (PHP)
  • Test these changes against all server versions and topologies (including standalone, replica set, sharded
    clusters, and serverless).

This PR introduces changes to the CRUD spec, covering multiple tickets:

  • DRIVERS-3009: Add a formal definition for Collection::findOne, especially in order to specify the limit behaviour and define available options
  • DRIVERS-1447: Clarify how to handle limit and batchSize options when they are set to the same value to avoid an unnecessary cursor being created on the server.

@alcaeus alcaeus requested a review from jmikola October 29, 2024 08:23
@alcaeus alcaeus requested a review from a team as a code owner October 29, 2024 08:24
@@ -702,6 +714,194 @@ class FindOptions {
*/
let: Optional<Document>;
}

class FindOneOptions {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This duplicates a significant amount of options from FindOptions. There are two alternatives:

  • Have FindOptions extend FindOneOptions and add options not supported by findOne
  • Introduce CommonFindOptions, have FindOptions and FindOneOptions extend that class and add their own.

If we want to avoid duplication, I'd prefer option one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this matters, but I know some languages actually base their own model classes after these definitions within the spec. Maybe better to ask them, although I suppose that only matters if they actually intend on implementing findOne() (IIRC, C# does not).

I hate that we define all of these in giant code blocks. Makes it impossible to deep link or add meaningful prose text. Logically, we wouldn't need to document anything here apart from saying it's the same as FindOptions with limit and batchSize removed (and maybe some other options specific to multiple results).

Having said all that, I agree with:

If we want to avoid duplication, I'd prefer option one.

Otherwise, CommonFindOptions and FindOneOptions would just be identical anyway.

I pray that no users of a driver ever have to see a FindOptions model extending FindOneOptions. That should be an implementation detail.

Copy link
Member

@ShaneHarvey ShaneHarvey Oct 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really dislike the duplication here. Here's what pymongo's docs say for find_one:

All arguments to :meth:find are also valid arguments for :meth:find_one, although any limit argument will be ignored.

What if we say something similar here instead of duplicating?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typescript has syntax for this that we use in our driver:

For example:

type FindOneOptions = Omit<FindOptions, 'timeoutMode'>

Not suggesting we use that language specific syntax here, but I think it would be fine to psuedo declare FindOneOptions are equal to FindOptions except for X Y Z

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alcaeus What is your thinking here? Seems like we have a preference for not duplicating the options, but as for exactly how to write it, there are a couple of choices here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After thinking about a good solution, I figured that solving this for find/findOne is only half the story, as there's lots of other duplication (e.g. UpdateOptions and ReplaceOptions). I filed DRIVERS-3038 to solve the duplication issue and also remove the code blocks: they are unnecessarily prescriptive and prevent us from properly linking to various options in the document. As such, I'd defer making changes to this for the time being.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something for next spec-fest perhaps :) fine by me

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we just go with one of the proposed simple solutions for now? We can fix the find/findOne duplication in this PR and then use DRIVERS-3038 to fix the remaining issues.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to the Omit syntax suggested by @nbbeeken.

@@ -2474,6 +2677,8 @@ aforementioned allowance in the SemVer spec.

## Changelog

- 2024-10-29: Defined `findOne` operation as optional.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"as optional" is tentative here pending an outcome of a poll across drivers. I'm considering making the operation required, but letting drivers choose their timeline for implementing this operation, as findOne is a common use case.

@@ -0,0 +1,60 @@
description: "findOne"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests heavily borrow from the find tests. I've decided to only explicitly check for limit and the absence of batchSize in the outgoing command. I don't think there are any other special cases we need to test.

@alcaeus alcaeus force-pushed the drivers-3009-findone-updates branch 3 times, most recently from b25faeb to 8594353 Compare October 29, 2024 13:22
@alcaeus alcaeus changed the title DRIVERS-3009: Define findOne operation DRIVERS-3009, DRIVERS-1447: Updates to find operations Oct 29, 2024
Comment on lines 983 to 986
When users specify both `limit` and `batchSize` options with the same value, the server returns all results in the
first batch, but still leaves an open cursor that needs to be closed using the `killCursors` command. To avoid this,
drivers MUST send a value of `limit + 1` for `batchSize` in the resulting `find` command. This eliminates the open
cursor issue.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlike suggested in the ticket, we can't just omit the batchSize option when limit <= batchSize. Omitting the batch size makes us fall back to the default batch size, which may be less than the limit requested by the user, defeating the expectation that all results are returned in a single batch. The only workaround for this is to set a batch size higher than limit to get the server to close the cursor immediately before returning results.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this is what you mentioned in Slack about limit + 1. I missed that it only applied to find and that you weren't discussing findOne (which can omit batchSize entirely).

* as it would be impossible to be forwards and backwards compatible. Let the server
* handle the validation.
*
* @see https://www.mongodb.com/docs/manual/core/read-operations-introduction/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the link to this one - I believe this works fine as an introduction to querying

source/crud/crud.md Outdated Show resolved Hide resolved
@@ -702,6 +714,194 @@ class FindOptions {
*/
let: Optional<Document>;
}

class FindOneOptions {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this matters, but I know some languages actually base their own model classes after these definitions within the spec. Maybe better to ask them, although I suppose that only matters if they actually intend on implementing findOne() (IIRC, C# does not).

I hate that we define all of these in giant code blocks. Makes it impossible to deep link or add meaningful prose text. Logically, we wouldn't need to document anything here apart from saying it's the same as FindOptions with limit and batchSize removed (and maybe some other options specific to multiple results).

Having said all that, I agree with:

If we want to avoid duplication, I'd prefer option one.

Otherwise, CommonFindOptions and FindOneOptions would just be identical anyway.

I pray that no users of a driver ever have to see a FindOptions model extending FindOneOptions. That should be an implementation detail.

Comment on lines 983 to 986
When users specify both `limit` and `batchSize` options with the same value, the server returns all results in the
first batch, but still leaves an open cursor that needs to be closed using the `killCursors` command. To avoid this,
drivers MUST send a value of `limit + 1` for `batchSize` in the resulting `find` command. This eliminates the open
cursor issue.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this is what you mentioned in Slack about limit + 1. I missed that it only applied to find and that you weren't discussing findOne (which can omit batchSize entirely).

source/crud/crud.md Show resolved Hide resolved
source/crud/tests/unified/find.yml Show resolved Hide resolved
source/crud/tests/unified/findOne.yml Show resolved Hide resolved
source/crud/crud.md Outdated Show resolved Hide resolved
@@ -702,6 +714,194 @@ class FindOptions {
*/
let: Optional<Document>;
}

class FindOneOptions {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typescript has syntax for this that we use in our driver:

For example:

type FindOneOptions = Omit<FindOptions, 'timeoutMode'>

Not suggesting we use that language specific syntax here, but I think it would be fine to psuedo declare FindOneOptions are equal to FindOptions except for X Y Z

source/crud/crud.md Show resolved Hide resolved
@alcaeus alcaeus force-pushed the drivers-3009-findone-updates branch from 041345a to b4c8e3d Compare November 5, 2024 10:22
@alcaeus alcaeus requested review from nbbeeken and jmikola November 5, 2024 12:42
@@ -702,6 +714,194 @@ class FindOptions {
*/
let: Optional<Document>;
}

class FindOneOptions {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alcaeus What is your thinking here? Seems like we have a preference for not duplicating the options, but as for exactly how to write it, there are a couple of choices here.

@alcaeus alcaeus requested a review from nbbeeken November 6, 2024 07:41
@nbbeeken nbbeeken requested a review from ShaneHarvey November 6, 2024 15:21
@alcaeus alcaeus force-pushed the drivers-3009-findone-updates branch from b4c8e3d to 734e40a Compare November 7, 2024 12:30
source/crud/crud.md Outdated Show resolved Hide resolved
source/crud/crud.md Outdated Show resolved Hide resolved
source/crud/crud.md Show resolved Hide resolved
@@ -2496,6 +2513,8 @@ aforementioned allowance in the SemVer spec.

## Changelog

- 2024-11-05: Define `findOne` operation as optional.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also note that you're adding guidance on limit and batchSize for find operations.

Copy link
Member

@jmikola jmikola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with @see and changelog updates.

@alcaeus alcaeus force-pushed the drivers-3009-findone-updates branch from 734e40a to 8780c53 Compare November 13, 2024 12:01
@alcaeus alcaeus merged commit 2362d1a into mongodb:master Nov 13, 2024
5 checks passed
@alcaeus alcaeus deleted the drivers-3009-findone-updates branch November 13, 2024 12:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants