MongoDriver unified event thread #32

prdoyle · 2023-06-15T03:25:10Z

In the recent "resilient" MongoDriver work, I have been struggling to distill out some principles of operation so I can understand how MongoDriver ought to behave in all situations.

I think I might have just figured it out.

The idea is to have all change event operations in the same background thread. This concept actually includes the loading of the initial state which, though not actually a change event, is motivated by the need to coordinate carefully with the event stream so we process only events that occurred after that initial state.

The pseudocode for the event thread becomes a scheduleWithFixedDelay call with the following procedure:

1. Connect
   a. Open change stream cursor, either
      - Using last processed resume token, or
      - Getting a fresh resume token, and the initial event
   b. Detect format, open format driver, load initial state
2. While there are more change events, process them
3. Finally: disconnect

This achieves numerous things all at once, compared to the current design, which does only #2 on the event thread:

- It uses the same background thread for event processing and event stream connection. This is really simple to explain, and eliminates all races in all stream operations because they’re on the same thread.
- It repeatedly attempts to reinitialize with a delay, thereby implementing liveness.
- It ensures there is only one reconnection at a time (because there's only one thread doing it!)
- It doesn’t uselessly check whether to reconnect when the connection is just fine already.
- We probably don’t need an explicit Session concept: a session would just be one execution of this procedure.
- Shutting down the thread pool cleanly kills all event stream operations for the entire bosk.

I need to think about this some more, but it seems promising.

prdoyle · 2023-06-15T11:26:17Z

A problem with this concept: because its only approach to reinitializing is to wait for the "fixed delay" to elapse and start over, this means every reload event (like a refurbish that changes the database format) is delayed by that much time, while it could actually succeed immediately.

Now I'm thinking, the event task should be a loop that only terminates when an unexpected exception is thrown. For expected exceptions, we immediately attempt reconnection. For unexpected exceptions, the task terminates, the delay time elapses, and then we try again. This is actually another pretty clean operating principle: the reconnection delay happens if and only if the reconnection loop terminates with an unexpected exception.

And, when it terminates with an exception, that's probably a good time to discard the resume token: if things have gotten so bad that we need to pause and try to start over, we should start fresh. (We could also discard the resume token under other circumstances as appropriate.)

So something like:

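// RECONNECT_DELAY_MS is a placeholder for whatever delay we settle on
ex.scheduleWithFixedDelay(this::eventLoop, 0, RECONNECT_DELAY_MS, MILLISECONDS);

...
void eventLoop() {
	try {
		while (true) {
			connect(); // must throw if it can't connect
			try {
				while (true) {
					processEvent(cursor.next());
				}
			} catch (CertainSpecificExceptions e) {
				// log, then fall through to disconnect and reconnect immediately
			} finally {
				disconnect();
			}
		}
	} catch (RuntimeException e) {
		// Unexpected: log it and return normally. If the exception escaped,
		// scheduleWithFixedDelay would suppress all subsequent executions.
	} finally {
		discardResumeToken();
	}
}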

Some care needs to be taken that the outer loop doesn't go into a spin retrying the connection over and over: failures to connect really need to throw exceptions.
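
To make the control flow concrete, here is a minimal, self-contained sketch of the loop above driven by a scripted fake event source instead of a real change stream. Everything in it is an assumption made for illustration: EventLoopSketch, ExpectedDisconnect, the "DROP" marker event, and the log messages are hypothetical names, and the exhausted iterator merely stands in for an unexpected failure. One detail it relies on is documented behaviour of ScheduledExecutorService: if a run of the task throws, later runs are suppressed, so the sketch catches the unexpected exception and returns normally.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;

class EventLoopSketch {
	/** Hypothetical stand-in for the "expected" exceptions that warrant an immediate reconnect. */
	static class ExpectedDisconnect extends RuntimeException {
		ExpectedDisconnect(String message) { super(message); }
	}

	final List<String> log = new ArrayList<>();
	// Each inner list is the sequence of events one simulated connection will deliver.
	private final Queue<List<String>> connections;
	private Iterator<String> cursor;

	EventLoopSketch(List<List<String>> script) {
		this.connections = new ArrayDeque<>(script);
	}

	void connect() {
		List<String> events = connections.poll();
		if (events == null) {
			// A failure to connect must throw; otherwise the outer loop would spin.
			throw new IllegalStateException("cannot connect");
		}
		cursor = events.iterator();
		log.add("connect");
	}

	void processEvent(String event) {
		if (event.equals("DROP")) {
			throw new ExpectedDisconnect(event);
		}
		log.add("event " + event);
	}

	/** One run of the scheduled task. Returns normally so the scheduler keeps rescheduling it. */
	void eventLoop() {
		try {
			while (true) {
				connect();
				try {
					while (true) {
						// The fake cursor throws NoSuchElementException when exhausted,
						// which plays the role of an unexpected failure here.
						processEvent(cursor.next());
					}
				} catch (ExpectedDisconnect e) {
					log.add("reconnect immediately");
				} finally {
					log.add("disconnect");
				}
			}
		} catch (RuntimeException e) {
			log.add("giving up until next delay: " + e.getClass().getSimpleName());
		} finally {
			log.add("discard resume token");
		}
	}

	static List<String> run(List<List<String>> script) {
		EventLoopSketch sketch = new EventLoopSketch(script);
		sketch.eventLoop();
		return sketch.log;
	}
}
```

In a real driver this would be scheduled with something like scheduleWithFixedDelay(sketch::eventLoop, 0, 5, TimeUnit.SECONDS), where the 5-second delay is an arbitrary choice. The delay then applies only after the "giving up" path, never after an expected disconnect.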

prdoyle · 2023-06-15T12:14:07Z

I should also mention: there's a (minor) challenge around implementing BoskDriver.initialState here, but it's not insurmountable.

initialState is different from the other operations because the loading of the current state must be done synchronously, so if we do it in another thread, we're going to need a way to wait for it.

1 reply

prdoyle Jun 24, 2023
Author

I've implemented it using FutureTask which is a component that allows a method to be called, then memoizes the outcome (return value or exception) and allows multiple threads to access it. I do the actual execution of this task on the background thread (for consistency with the reinitialization case, which also happens on the background thread in response to a change stream event), and the foreground initialRoot call waits for it.
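
A minimal sketch of that FutureTask arrangement, not the actual MongoDriver code: InitialStateSketch, loadInitialState, and the "event-thread" name are hypothetical, and the string result stands in for the real state tree. The task runs once on the background thread; the foreground call blocks in get(), and a second get() returns the memoized outcome without running the task again.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.FutureTask;
import java.util.concurrent.atomic.AtomicInteger;

class InitialStateSketch {
	static final AtomicInteger loadCount = new AtomicInteger();

	/** Hypothetical stand-in for loading the current state from the database. */
	static String loadInitialState() {
		loadCount.incrementAndGet();
		return "loaded on " + Thread.currentThread().getName();
	}

	/** Foreground entry point: hands the load to the background thread and waits for it. */
	static String initialRoot() {
		ExecutorService background =
			Executors.newSingleThreadExecutor(r -> new Thread(r, "event-thread"));
		try {
			FutureTask<String> task = new FutureTask<>(InitialStateSketch::loadInitialState);
			background.execute(task);    // the actual work happens on the background thread
			String first = task.get();   // the calling thread blocks until it completes
			String second = task.get();  // memoized: the task does not run a second time
			return first.equals(second) ? first : null;
		} catch (InterruptedException e) {
			Thread.currentThread().interrupt();
			throw new IllegalStateException(e);
		} catch (ExecutionException e) {
			// An exception thrown by the task is memoized too and rethrown to every waiter.
			throw new IllegalStateException(e.getCause());
		} finally {
			background.shutdown();
		}
	}
}
```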

gradycsjohnson · 2023-06-15T13:45:12Z

I like it!
Couple questions:

1. Are we confident that disconnections always throw an exception?
2. Is it possible for the event loop to get stuck indefinitely/intolerably without technically disconnecting?
3. If so, is there a way we can reliably distinguish between a loop that gets stuck and a dormant event stream?

3 replies

prdoyle Jun 16, 2023
Author

Good questions.

1. I'm pretty confident. Without an exception, the cursor would return an event. (After the last event, it throws an exception, like Java Iterator.)
2. I don't know for sure if that's possible but I suspect not. The change stream works by polling, and each polling operation has a timeout, after which it throws. I think if cursor.next() isn't returning, that's because the polling keeps telling it that the collection is dormant.

prdoyle Jun 16, 2023
Author

This thinking sorta goes out the window for those portions of the event loop during which we're not waiting for cursor.next()... but I think most of it applies there too.

gradycsjohnson Jun 17, 2023

Okay great!

prdoyle · 2023-06-17T13:57:54Z

I thought I'd make a note of something here too.

Today I came to wonder: why do we need a "disconnected state" where the driver uniformly throws an exception for all driver operations? Why can't each of those operations succeed or fail on their own merit? If the database is inaccessible, each submitted update / flush will fail anyway; if it's accessible, then great, the operations can succeed.

I think the risk is in the case where the driver loses confidence in its ability to understand the database state. For example, if it can't determine the right format, or receives a drop event indicating that the collection has disappeared. In these cases, it's not clear that the driver should attempt any writes, because they could corrupt the database. (Say, for example, a newer replica of the application did a refurbish and upgraded the database into a state that our replica doesn't support.)

The disconnected state may not be required for cases of interrupted connectivity, but it may be simpler to handle these cases the same way; after all, what is the value of avoiding a "disconnected state" when you can't reach the database anyway?
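
As an illustration of what a uniform "disconnected state" could look like, here is a small sketch under stated assumptions: DisconnectedStateSketch, DisconnectedException, onConnected, onDisconnected, and the string-based submit are all hypothetical and are not Bosk's actual API. The event thread flips a single field, and every driver operation checks it before touching the database.

```java
import java.util.ArrayList;
import java.util.List;

class DisconnectedStateSketch {
	static class DisconnectedException extends RuntimeException {
		DisconnectedException(String reason) { super("Disconnected: " + reason); }
	}

	// null means "connected"; otherwise it records why we stopped trusting the database state.
	private volatile String disconnectReason = "not yet connected";
	final List<String> writes = new ArrayList<>();

	// These two would be called from the event thread as it connects and disconnects.
	void onConnected() { disconnectReason = null; }
	void onDisconnected(String reason) { disconnectReason = reason; }

	void submit(String update) {
		requireConnected();
		writes.add(update); // stands in for a real database write
	}

	void flush() {
		requireConnected();
	}

	private void requireConnected() {
		String reason = disconnectReason; // read once so the check and the message agree
		if (reason != null) {
			throw new DisconnectedException(reason);
		}
	}
}
```

The point of the single check is that a driver which has, say, seen the collection dropped refuses writes even though the database itself is still reachable.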

prdoyle · 2023-06-17T16:30:05Z

I think the mental shift of this idea is as follows.

The "v2" design considers a live connection to MongoDB as being the "normal" state, and then attempts various corrective actions when things go wrong. The corrective actions could occur on various threads at various times, and must reach the desired state starting from whatever state they find themselves in. Nested corrections could even occur while another corrective action is still in progress.

The "v3" design looks at the driver lifetime as a sequence of (ideally just one) connect-process-disconnect cycles. The whole cycle is initiated by a single background thread, and so these operations have greatly reduced possibility for race conditions. They can use things like catch blocks and try-with-resources to exit with a clean state, rather than having to clean up during each corrective action. Initialization and orderly shutdown naturally share the same logic with reconnection.
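
A rough sketch of one connect-process-disconnect cycle using try-with-resources, assuming a hypothetical Connection class (none of these names come from the real driver). It shows the property the paragraph above relies on: the resource is closed on both the normal and the exceptional path, and it is closed before the catch block runs.

```java
import java.util.ArrayList;
import java.util.List;

class CycleSketch {
	/** Hypothetical connection: "connects" on construction, "disconnects" on close. */
	static final class Connection implements AutoCloseable {
		private final List<String> log;

		Connection(List<String> log) {
			this.log = log;
			log.add("connect");
		}

		void process(String event) {
			if (event.isEmpty()) {
				throw new IllegalArgumentException("malformed event");
			}
			log.add("process " + event);
		}

		@Override
		public void close() {
			log.add("disconnect");
		}
	}

	/** Runs a single cycle and returns the sequence of steps that happened. */
	static List<String> runCycle(List<String> events) {
		List<String> log = new ArrayList<>();
		try (Connection connection = new Connection(log)) {
			for (String event : events) {
				connection.process(event);
			}
		} catch (IllegalArgumentException e) {
			// By the time we get here, close() has already run.
			log.add("failed: " + e.getMessage());
		}
		return log;
	}
}
```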

The actual driver operations interact with this background thread in a number of ways:

- Performing database writes to implement the submit methods, which cause the background thread to receive change events;
- Responding to callbacks via the Listener interface; and
- Blocking to wait for a reconnection operation to complete.

All in all, I think it's going to be much easier to reach a high degree of confidence in the v3 design.

I'm also skipping resume tokens entirely for now. In v2 they made things very difficult to reason about and generally just made things worse, so I didn't use them anyway; in v3, this could lead to a substantial reduction in complexity, and I can try to add resume token support again later if we find a need.

prdoyle · 2023-06-20T19:58:51Z

I've encountered a problem with v3 that I didn't see on v2. It's documented in this MongoDB support case, though I'm not sure how much of this is visible to the public.

This seems to be a showstopper. Until I know more about why this is failing, I can't really proceed with v3. 😞

I'll paste the description here:

The attached unit test has two threads. The "Setter" thread updates a field value from 0 to 1, and performs a read to verify that the 1 is indeed present in the database. The other "Opener" thread does the following, in order:

1. Open a change stream.
2. Read the field. Verify that it's 0 (or else start over and try again).
3. Wait for a change event.

The unit test fails: the change event does not arrive before the (rather generous 5-second) timeout expires.

By my understanding of change streams, this is not supposed to occur. Because the stream was opened (step 1) before the read (step 2) there are two valid outcomes; either:

- the read should return 1, or
- the read should return 0 and then the change stream should return an event changing the field from 0 to 1

The test fails because neither of these is happening.

prdoyle · 2023-06-21T22:19:58Z

Alright, I think @zAlbee figured it out: when we read the revision field, we need to use read concern LOCAL in order to get the latest value. If we use our usual MAJORITY read concern, then we risk getting a stale value from the past, which makes us hang waiting for an update that already occurred.

The usual risk of LOCAL is that you can observe values that are not yet durable, but I don't really think that's a concern in this case. If something goes wrong so that a write is not durable, it's hard to see how Bosk wouldn't reconnect and start again anyhow.

prdoyle · 2023-06-24T12:58:48Z

After a bunch more bug fixing, v3 is now the most reliable MongoDriver. I wrote some automation to re-run the automated tests, and I actually had to disable the v1 and v2 drivers because they were failing more often than v3. I left the tests running in a loop overnight, and in 8 hours they passed 72 times with no failures, which is a new record.

My current plan:

1. Tidy v3 and merge it
2. Delete v2
3. Publish a release
4. Update the docs
5. Make v3 the new default
6. Publish another release

11 replies

prdoyle Jun 25, 2023
Author

For the epoch, I'm thinking of a random UUID rather than some sort of meaningful number. I think adding a second incrementing counter would just beg the question.

prdoyle Jun 25, 2023
Author

Docs are partially updated in #40 but I think there's more to do. Some other docs, like the User's Guide, contain some information that would be outdated once we switch to v3.

prdoyle Jun 25, 2023
Author

Ok I think I'm done updating the docs

prdoyle Jun 25, 2023
Author

5 - v3 is now the default (#41)

prdoyle Jun 25, 2023
Author

6 - I've kicked off a new release called 0.0.90 to make v3 the default

prdoyle · 2023-06-25T23:21:00Z

Alright, it's released! Version 0.0.90 is the first one to use the new MongoDriver. 🎉
