Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DRIVERS-1408 - Add guidance on adding _id fields to documents to CRUD spec #1688

Merged
merged 6 commits into from
Oct 31, 2024

Conversation

NoahStapp
Copy link
Contributor

Python implementation: mongodb/mongo-python-driver#1976

The new prose test only verifies that drivers correctly implement prepending _id for insert_one. Is it worthwhile to verify every other insert operation (insert_many, bulk_write, client_bulk_write) as well?

@NoahStapp NoahStapp requested a review from a team as a code owner October 28, 2024 16:08
@NoahStapp NoahStapp requested review from nbbeeken and removed request for a team October 28, 2024 16:08
@NoahStapp NoahStapp changed the title DRIVERS-1408 - Add guidance on adding _id fields to documents to CRUD… DRIVERS-1408 - Add guidance on adding _id fields to documents to CRUD spec Oct 28, 2024
source/crud/tests/README.md Outdated Show resolved Hide resolved
source/crud/crud.md Outdated Show resolved Hide resolved
source/crud/crud.md Outdated Show resolved Hide resolved
@@ -815,6 +815,12 @@ database-level aggregation will allow users to receive a cursor from these colle

##### Insert, Update, Replace, Delete, and Bulk Writes

###### Generated identifiers

The insert and bulk insert operations described below MUST generate identifiers for all documents that do not already
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted that this is already discussed in the client bulk write spec.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having this behavior discussed here for general CRUD operations makes sense to me, as the client bulk write spec follows from this broader context.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The client bulk write spec stands on its own, except for a reference back to CRUD for modeling unacknowledged write results.

With my last comment, I just meant to acknowledge that no changes were needed to the client bulk write spec since it already addressed _id ordering. This was in the vein of previous PRs like #1644 that required updates to both specs.


Construct a `MongoClient` (referred to as `client`) with
[command monitoring](../../command-logging-and-monitoring/command-logging-and-monitoring.md) enabled to observe
CommandStartedEvents. Perform an `insert` command using `client` and assert that one CommandStartedEvent (referred to as
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perform an insert command

This is ambiguous, as the prose test could be implementing using the generic command runner, in which case I'd expect no ID generation. I'd suggest changing this to explicitly suggesting running an insertOne operation. You can extend that with insertMany and a bulkWrite if you want full test coverage.

source/crud/tests/README.md Outdated Show resolved Hide resolved
@NoahStapp NoahStapp requested a review from a team as a code owner October 28, 2024 20:16
@NoahStapp NoahStapp requested review from durran, nbbeeken and jmikola and removed request for a team October 28, 2024 20:16
CommandStartedEvents. For each of `insertOne`, client `bulkWrite`, and collection `bulkWrite`, do the following:

- Execute the command with a document that does not contain an `_id` field.
- If possible, capture the wire protocol message (referred to as `request`) of the command.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this only necessary in languages where the ordering is still not deterministic in the CommandStartedEvent's command document?

Copy link
Contributor Author

@NoahStapp NoahStapp Oct 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the preferred way to verify the ordering of the document, as drivers may modify the document between emitting the CommandStartedEvent and the actual wire transfer of the command. For example, the Python driver re-orders _id to be the first field during BSON conversion, which takes place after the CommandStartedEvent would be emitted.

Verifying the order of the actual transmitted payload document ensures that the server receives exactly what we expect it to.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it could potentially be rephrased if wire protocol parsing is difficult for drivers to achieve, but I agree it would be easy to assert _id is the first key in command monitoring and that not actually be the case when the wire message is produced.

I think perhaps for this test; we can encourage drivers to just use command monitoring as the "main path" set of assertions to implement. But we should encourage drivers to check that their wire message key order either matches their command events or that reordering is done when the wire message is built.

So for Node, I would write a test that asserts the JS object that is inspectable on the event (for any command) matches key order in BSON. Python may check that their wire messages are ordered correctly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If drivers are able to check wire messages directly, they should take that path. Otherwise, they should fall back to using command monitoring, ideally verifying that they don't modify field ordering before sending data over the wire.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, after all the point here is that the bytes are in an order that the server benefits from, if your command monitoring is a source of truth for such a thing you could/should use it but the real goal is in the wire message.

source/crud/crud.md Outdated Show resolved Hide resolved
source/crud/tests/README.md Show resolved Hide resolved
CommandStartedEvents. For each of `insertOne`, client `bulkWrite`, and collection `bulkWrite`, do the following:

- Execute the command with a document that does not contain an `_id` field.
- If possible, capture the wire protocol message (referred to as `request`) of the command.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it could potentially be rephrased if wire protocol parsing is difficult for drivers to achieve, but I agree it would be easy to assert _id is the first key in command monitoring and that not actually be the case when the wire message is produced.

I think perhaps for this test; we can encourage drivers to just use command monitoring as the "main path" set of assertions to implement. But we should encourage drivers to check that their wire message key order either matches their command events or that reordering is done when the wire message is built.

So for Node, I would write a test that asserts the JS object that is inspectable on the event (for any command) matches key order in BSON. Python may check that their wire messages are ordered correctly.

@alcaeus alcaeus removed their request for review October 31, 2024 09:03
@NoahStapp NoahStapp removed the request for review from durran October 31, 2024 13:30
Copy link
Member

@jmikola jmikola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM w/ or w/o my suggestion.

source/crud/tests/README.md Outdated Show resolved Hide resolved
@NoahStapp NoahStapp merged commit b607a57 into mongodb:master Oct 31, 2024
4 of 5 checks passed
@NoahStapp NoahStapp deleted the DRIVERS-1408 branch October 31, 2024 16:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants