-
Notifications
You must be signed in to change notification settings - Fork 244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DRIVERS-2677: clarify when ServerHeartbeatStartedEvent should be emitted #1483
DRIVERS-2677: clarify when ServerHeartbeatStartedEvent should be emitted #1483
Conversation
…tor connection creation
…rted-event-ordering
…of github.com:W-A-James/specifications into DRIVERS-2677/clarify-heartbeat-started-event-ordering
Heartbeat Tests | ||
~~~~~~~~~~~~~~~ | ||
|
||
1. Test that ``ServerHeartbeatStartedEvent`` is emitted after the monitoring socket was created and before the ``hello`` is sent to the server | ||
|
||
#. Create a mock TCP server (example shown below) that pushes a ``client connected`` event to a shared array when a client connects and a ``client hello received`` event when the server receives the client hello and then closes the connection:: | ||
|
||
let events = []; | ||
server = createServer(clientSocket => { | ||
events.push('client connected'); | ||
|
||
clientSocket.on('data', () => { | ||
events.push('client hello received'); | ||
clientSocket.end(); | ||
}); | ||
}); | ||
server.listen(9999); | ||
|
||
#. Create a client with ``serverSelectionTimeoutMS: 500, maxPoolSize: 1, minPoolSize: 0`` and listen to ``ServerHeartbeatStartedEvent`` and ``ServerHeartbeatFailedEvent``, pushing the event name to the same shared array as the mock TCP server. | ||
|
||
#. Attempt to connect client to previously created TCP server, catching the error when the client fails to connect | ||
|
||
#. Assert that the event array has a length of 4 | ||
|
||
#. Assert that the first event is the ``client connected`` event | ||
|
||
#. Assert that the second event is the ``serverHeartbeatStartedEvent`` | ||
|
||
#. Assert that the third event is the ``client hello received`` event | ||
|
||
#. Assert that the fourth event is the ``serverHeartbeatFailedEvent`` | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we think this test will be accurate with respect to the relative ordering of events on the driver side even though we'll be using sockets (albeit locally)?
This seems to be the best approach for now since we don't have more granular logging for when the the monitoring connection is created and when the hello itself is sent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems so to me.
even though we'll be using sockets
What is the implied concern here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking of whether or not the delay between creating the connection and emitting the ServerHeartbeatStartedEvent
on the client side would be shorter than the time it takes for the mock server to emit the client connected
event as in the following sequence diagram:
client | server |
---|---|
create connection | . |
emit heartbeat started | . |
. | receive connection; emit client connected |
send hello | . |
. | receive hello; emit client hello received ; close connection |
emit heartbeat failed | . |
which would lead the events
buffer to look like this
[
'serverHeartbeatStartedEvent',
'client connected',
'client hello received',
'serverHeartbeatFiledEvent'
]
and report a false failure since the client actually performed the steps in the correct order
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. There is a bit of an assumption going on here that there wouldn't be any multithreading or multiprocessing forcing the server's "client connected" to strictly proceed the driver code getting an opportunity to run whatever "connection established" handling it has.
Perhaps the test could divest its core assertions a bit from the suggested implementation. A compiled language may be able to accomplish the same goal if it were to supplant the socket API with a function under its control to get notified of socket construction.
I hope there is enough prior art to make this test accessible:
- https://nodejs.org/docs/latest/api/net.html#class-netserver
- https://docs.python.org/3/library/socketserver.html#socketserver-tcpserver-example
- https://ruby-doc.org/stdlib-2.4.0/libdoc/socket/rdoc/TCPServer.html
- https://docs.oracle.com/javase/8/docs/api/java/net/ServerSocket.html
- https://learn.microsoft.com/en-us/dotnet/api/system.net.sockets.tcplistener?view=net-8.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: we want the order to be
['serverHeartbeatStarted', 'client connected', 'client hello received', 'serverHeartbeatFailed']
...e/server-discovery-and-monitoring/server-discovery-and-monitoring-logging-and-monitoring.rst
Outdated
Show resolved
Hide resolved
Heartbeat Tests | ||
~~~~~~~~~~~~~~~ | ||
|
||
1. Test that ``ServerHeartbeatStartedEvent`` is emitted after the monitoring socket was created and before the ``hello`` is sent to the server | ||
|
||
#. Create a mock TCP server (example shown below) that pushes a ``client connected`` event to a shared array when a client connects and a ``client hello received`` event when the server receives the client hello and then closes the connection:: | ||
|
||
let events = []; | ||
server = createServer(clientSocket => { | ||
events.push('client connected'); | ||
|
||
clientSocket.on('data', () => { | ||
events.push('client hello received'); | ||
clientSocket.end(); | ||
}); | ||
}); | ||
server.listen(9999); | ||
|
||
#. Create a client with ``serverSelectionTimeoutMS: 500, maxPoolSize: 1, minPoolSize: 0`` and listen to ``ServerHeartbeatStartedEvent`` and ``ServerHeartbeatFailedEvent``, pushing the event name to the same shared array as the mock TCP server. | ||
|
||
#. Attempt to connect client to previously created TCP server, catching the error when the client fails to connect | ||
|
||
#. Assert that the event array has a length of 4 | ||
|
||
#. Assert that the first event is the ``client connected`` event | ||
|
||
#. Assert that the second event is the ``serverHeartbeatStartedEvent`` | ||
|
||
#. Assert that the third event is the ``client hello received`` event | ||
|
||
#. Assert that the fourth event is the ``serverHeartbeatFailedEvent`` | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems so to me.
even though we'll be using sockets
What is the implied concern here?
|
||
#. Assert that the third event is the ``client hello received`` event | ||
|
||
#. Assert that the fourth event is the ``serverHeartbeatFailedEvent`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the significance of the "failed" event? It seems like we could just
- test against a standalone
- connect
- confirm the ordering of events (connected, heartbeat started, hello received, heartbeat succeeded)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The failed
event is expected as the mock server doesn't need to be able to properly reply to the hello for this test to prove the ordering so we expect that it is the case that it won't. As a result, we just want to be sure that the driver fails its heartbeat since the monitoring connection will be closed. I'm open to removing that as a requirement for the test, but it seems appropriate to include.
When you say test against a standalone, how would we track when the connection was created and the hello was received when testing against an actual Mongo server?
@@ -138,7 +138,7 @@ Events that MUST be published (with their conditions) are as follows. | |||
* - ``TopologyClosedEvent`` | |||
- When a topology is shut down - this MUST be the last SDAM event fired. | |||
* - ``ServerHeartbeatStartedEvent`` | |||
- Published when the server monitor sends its ``hello`` or legacy hello call to the server. | |||
- Published when the server monitor sends its ``hello`` or legacy hello call to the server. When the monitor is creating a new connection, this event MUST be published just after the socket is created and just before the ``hello`` or legacy hello. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this right? I don't think this is what we do in Node. we emit heartbeat started events before establishing a socket (before a call to connect
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shane and I are discussing this further, but our initial decision was to do what we currently do in the Node driver, but this does make it so that the the duration
on the first ServerHeartbeatSucceededEvent
would include the time taken to complete the TCP handshake which adds at least another network round trip, artificially inflating the RTT since that gets calculated from the aforementioned duration
field.
How I currently have this written would avoid this, but would require changes in the Node driver among others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: Will change this to expect the ServerHeartbeatStartedEvent
before connection creation. There was confusion on my end as I thought we had to measure the start time as when the ServerHeartbeatStartedEvent
was emitted. On Node's end this will require still require us to change how we implement our monitoring since we currently measure the heartbeat duration starting as mentioned above.
1. Test that ``ServerHeartbeatStartedEvent`` is emitted after the monitoring socket was created and before the ``hello`` is sent to the server | ||
|
||
#. Create a mock TCP server (example shown below) that pushes a ``client connected`` event to a shared array when a client connects and a ``client hello received`` event when the server receives the client hello and then closes the connection:: | ||
|
||
let events = []; | ||
server = createServer(clientSocket => { | ||
events.push('client connected'); | ||
|
||
clientSocket.on('data', () => { | ||
events.push('client hello received'); | ||
clientSocket.end(); | ||
}); | ||
}); | ||
server.listen(9999); | ||
|
||
#. Create a client with ``serverSelectionTimeoutMS: 500`` and listen to ``ServerHeartbeatStartedEvent`` and ``ServerHeartbeatFailedEvent``, pushing the event name to the same shared array as the mock TCP server. | ||
|
||
#. Attempt to connect client to previously created TCP server, catching the error when the client fails to connect | ||
|
||
#. Assert that the event array has the following contents in the same order:: | ||
|
||
['serverHeartbeatStartedEvent', 'client connected', 'client hello received', 'serverHeartbeatFailedEvent'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're concerned with the the fact that the ServerHeartbeatStartedEvent
gets emitted before the connection is created and so we should fail if the connection fails to be established, would it be sufficient for us to just try to connect to a port that has nothing listening on it and expect to get a ServerHeartbeatStartedEvent
followed by a ServerHeartbeatFailedEvent
, or do we want to keep this granularity?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I think that test would work as well. Would it be easier to implement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it would. It wouldn't require the mock server at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revisiting this test; unless a driver can spy on the function that creates the socket, the updated version of this test wouldn't prevent something like this from passing:
let socketOrError = await connect().catch(error => error);
if (isSocket(socketOrError)) {
emitHBStarted()
await runHeartBeat()
} else {
emitHBStarted()
throw socketOrError;
}
// Assume emitHBFailed will be done if any of the above throws
We should roll back to actually using the mock server as that will provide a stronger guarantee on the ordering we want to test here.
@@ -138,7 +138,7 @@ Events that MUST be published (with their conditions) are as follows. | |||
* - ``TopologyClosedEvent`` | |||
- When a topology is shut down - this MUST be the last SDAM event fired. | |||
* - ``ServerHeartbeatStartedEvent`` | |||
- Published when the server monitor sends its ``hello`` or legacy hello call to the server. | |||
- Published when the server monitor sends its ``hello`` or legacy hello call to the server. When the monitor is creating a new connection, this event MUST be published just before the socket is created |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing punctuation: created.
If the ``serverConnectionId`` is not yet known at the time this log is emitted, the unstructured form SHOULD be as follows: | ||
|
||
Heartbeat started for {{serverHost}}:{{serverPort}} on connection with driver-generated ID {{driverConnectionId}} in topology with ID | ||
{{topologyId}}. Awaited: {{awaited}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this documenting behavior that's already tested? Or do we need a new test for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd note that this seems unrelated to DRIVERS-2677. Can we open a new drivers ticket for this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is new behaviour that needs to be tested, but can't be until a new DRIVERS ticket to add serverConnectionId
to ServerHeartbeatStartedEvent
as an optional field is created. I'm open to removing this until that ticket is filed and in progress.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, I vote we move it out to that ticket to avoid confusion and reduce the scope here.
Heartbeat Tests | ||
~~~~~~~~~~~~~~~ | ||
|
||
1. Test that ``ServerHeartbeatStartedEvent`` is emitted after the monitoring socket was created and before the ``hello`` is sent to the server |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
emitted after the monitoring socket was created
This is the opposite of what the current spec says.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch
1. Test that ``ServerHeartbeatStartedEvent`` is emitted after the monitoring socket was created and before the ``hello`` is sent to the server | ||
|
||
#. Create a mock TCP server (example shown below) that pushes a ``client connected`` event to a shared array when a client connects and a ``client hello received`` event when the server receives the client hello and then closes the connection:: | ||
|
||
let events = []; | ||
server = createServer(clientSocket => { | ||
events.push('client connected'); | ||
|
||
clientSocket.on('data', () => { | ||
events.push('client hello received'); | ||
clientSocket.end(); | ||
}); | ||
}); | ||
server.listen(9999); | ||
|
||
#. Create a client with ``serverSelectionTimeoutMS: 500`` and listen to ``ServerHeartbeatStartedEvent`` and ``ServerHeartbeatFailedEvent``, pushing the event name to the same shared array as the mock TCP server. | ||
|
||
#. Attempt to connect client to previously created TCP server, catching the error when the client fails to connect | ||
|
||
#. Assert that the event array has the following contents in the same order:: | ||
|
||
['serverHeartbeatStartedEvent', 'client connected', 'client hello received', 'serverHeartbeatFailedEvent'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I think that test would work as well. Would it be easier to implement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we remove the serverConnectionId bits from the PR title and description?
@@ -462,7 +464,7 @@ The following key-value pairs are common to all or several log messages and MUST | |||
* - serverConnectionId | |||
- Heartbeat-related log messages | |||
- Int | |||
- The server's ID for the monitoring connection, if known. This value will be unknown and can be omitted in certain cases, e.g. the first | |||
- The server's ID for the monitoring connection, if known. This value will be unknown and MUST be omitted in certain cases, e.g. the first |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should also be pushed out to the other ticket.
|
||
#. Attempt to connect client to a local port that is not listening for connections. e.g. ``mongodb://localhost:9999`` | ||
|
||
#. Assert that the event array has the following contents in the same order:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since SDAM can try to connect multiple times, I would say:
"Assert that the first two events in the array are::"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Changes
ServerHeartbeatStartedEvent
MUST be emitted just after monitor creates monitoring connection and before the first hello is emitted.POC on Node driver
Please complete the following before merging: