Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SAI-4661: fix race condition in singleton-per-collection StateWatcher creation #151

Merged

Conversation

magibney
Copy link
Collaborator

No tests yet; will be tricky to test this, esp because I'm not 100% sure where the race condition actually comes in. Buuut I can definitely see several opportunities for race conditions, and this PR will fix the issue by leaning on the deduplication of collectionWatches and slightly modifying the logic of how StateWatcher determines whether to expire or reset the watch.

collectionWatches.compute(
collection,
(k, v) -> {
if (v == null) {
v = new StatefulCollectionWatch();
Copy link
Collaborator

@hiteshk25 hiteshk25 Sep 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@magibney its not clear to me whether current code has any race condition here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some explanation on the most relevant section of the PR code.

Comment on lines -1334 to +1343
if (!collectionWatches.watchedCollections().contains(coll)) {
// This collection is no longer interesting, stop watching.
StatefulCollectionWatch scw = collectionWatches.statefulWatchesByCollectionName.get(coll);
if (scw == null || scw.associatedWatcher != this) {
// Collection no longer interesting, or we have been replaced by a different watcher.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The substance of the fix is in these three lines. Since StateWatcher instance do a deferred check against collectionWatches in order to determine whether to exit, it's possible that by the time a StateWatcher checks here, the watch it was created to serve has been removed, and a new watch inserted (and new StateWatcher created) before this check. In that case it's possible to accumulate persistent StateWatcher instances that all notify callbacks on the same event.

The fix is for the StateWatcher to check whether it is the instance responsible for the current entry in collectionWatches, and exit if not. The rest of the changes are basically just to ensure that The new StatefulCollectionWatch.associatedWatcher field is never null once it's been inserted in the map.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe will do f2f session for it.

the watch it was created to serve has been removed, and a new watch inserted

When/where we remove watch?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above 3 lines is where the StateWatcher is "removed", passively, by having the process() method return early without resetting itself. There are a number of places where DocCollectionWatchers are started, including waitForState(), etc. Not actually sure which of these is the likely culprit, but the test definitely reproduces the problematic behavior we were observing, in a way that should not be possible to do (and is fixed by this change).

int iterations = 10;
for (int i = 0; i < extraWatchers; i++) {
// add and remove a bunch of watchers
DocCollectionWatcher w = (coll) -> false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume generally we want to return true here. Wondering what happens if we return true here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it depends. Some watches are one-off (like in waitForState), but there's at least one watcher that's persistent for the life of the core: ZkController.UnloadCoreOnDeletedWatcher. In this test, because the point is that we add-and-remove quickly, returning true wouldn't change anything ... it would just be another way of removing the watcher.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We just want to make sure with prs collection and restart, we see this behavior?

try {
reader.registerDocCollectionWatcher("c1", w);
} finally {
reader.removeDocCollectionWatcher("c1", w);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see test is removing watch here - would be good to create test with prs collection with n shards and then restart. I'm bit trying to simulate our solr node behavior

@@ -642,8 +647,10 @@ private void constructState(Set<String> changedCollections) {

/** Refresh collections. */
private void refreshCollections() {
for (String coll : collectionWatches.watchedCollections()) {
new StateWatcher(coll).refreshAndWatch();
Copy link
Collaborator

@patsonluk patsonluk Oct 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch on this @magibney !

I traced the code logic and design a little bit. The principle design for removal of watches (ie NOT re-install the watch to ZK after event trigger) appears to tie to the absence of collection name in collectionWatches.watchedCollections() in StateWatcher#process (which you pointed out later)

IMHO, this new StateWatcher() call is a bit of an oddball, it skips the checks on collectionWatches (unlike 2 other places that instantiate the StateWatcher, which check collectionWatches and only create new instance if there's no entry for the target collection), hence this StateWatcher fails to remove itself at https://github.com/cowpaths/fullstory-solr/pull/151/files#diff-9b1918d641f78a8c245306b47ffd7a4554a12771ed4cb42293a411537fe7b54dL1334.

I cannot find documentation stating that there could ever be only one StateWatcher instance per collection per zkReader - or perhaps I missed it, but such could have been implied by the current design of collectionWatches.

I'm unsure of the expected behaviors of this refreshCollections method (javadoc seems lacking 😅 ), it might simply want to re-fetch the collection states instead of installing a new zk watch? It could be using new StateWatcher(coll).refreshAndWatch(); out of convenience, but it installs an extra zk watch for a collection already exists in collectionWatches.

Or if this also wants to install a new "replacement" watch (of a "stale" watch, though such watch could still be working if we see duplicates of event notification as described in https://issues.apache.org/jira/browse/SOLR-11535), then i think @magibney 's solution makes perfect sense - instead of having only collectionWatches field to keep track of what collections have zk watch installed, we also need to know exactly which StateWatcher instance is being used.

@patsonluk
Copy link
Collaborator

@magibney i 💯 agree with your description as in https://issues.apache.org/jira/browse/SOLR-11535?focusedCommentId=17770574&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17770574 .

I'm wondering though (which your are probably also working on it), how did such race condition happen during our restart. AFAIK, the ONLY place that could remove the collection from collectionWatches is removeDocCollectionWatcher - ignore registerCore and unregisterCore, those are not referenced anymore - we should remove them. I think you have also made a mention of DocCollectionWatcher at here. I believe DocCollectionWatcher is a higher level watcher, which a StateWatcher (usually one per collection) is the one installed to zk, and then on zk event, we look up the StatefulCollectionWatch associated with the path (ie collection), find out all the DocCollectionWatcher(s) registered against such StatefulCollectionWatch and notify them all. The naming is super confusing to be honest 😭

Therefore instantiation of DocCollectionWatcher and invocation of registerDocCollectionWatcher is not supposed to install new StateWatcher on ZK unless it's the first DocCollectionWatcher registration on such collection as the logic is guarded by the compute method of collectionWatches and the ref count check.

And if we look at invocation of removeDocCollectionWatcher again, it's not supposed to remove the underlying StateWatcher unless the core is unloaded, as it's guarded by ref count and the first registration, which the listener is ZkController#UnloadCoreOnDeletedWatcher.

So a race condition for collectionWatches might arise if we are unloading a core and somehow some logic is registering new DocCollectionWatcher???

Another thing that DID happen, which is refreshCollections that add another StateWatcher which bypass the collectionWatches checks, as I could find logs of o.a.s.c.ZkController ZooKeeper session re-connected ... refreshing core states after session expiration. on several nodes (but not too many of them, only 9 matches during the first hour of restart), the refreshCollections call would have doubled the amount of StateWatcher (ie watches installed against ZK).

That alone probably did not cause a "runaway" zkCallback, but perhaps did exacerbate the problem? 🤔

@patsonluk
Copy link
Collaborator

@magibney so i debugged a little bit and it's true that there are several zk watch installation from the preRegister steps during restart, even though the corresponding entries are purged from the ZkStateReader.collectionWatches, as far as the "passive" removal from StateWatcher#process is not triggered before the more "permanent" watch with UnloadCoreOnDeletedWatcher is installed, then those dangling StateWatcher will just follow the collection until the core is unloaded.

}

DocCollection state = clusterState.getCollectionOrNull(collection);
if (stateWatcher.onStateChanged(state) == true) {
removeDocCollectionWatcher(collection, stateWatcher);
if (docCollectionWatcher.onStateChanged(state) == true) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice rename here! the original stateWatcher was confusing as it's not really a StateWatcher 😓

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, like 9 different concepts of "state" and "watcher" going on here :-)

@@ -372,6 +372,11 @@ private StatefulCollectionWatch compute(

private static class StatefulCollectionWatch extends CollectionWatch<DocCollectionWatcher> {
private DocCollection currentState;
private volatile StateWatcher associatedWatcher;
Copy link
Collaborator

@patsonluk patsonluk Oct 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add some description for this field, there are DocCollectionWatcher and StateWatcher here, which can be confusing for devs unfamiliar with code (OR forgetful dev like me 😅 ).

Perhaps it's good to mention that this is the underlying ZK level watcher that we tracking. And we need this to stop re-installing old ZK StateWatcher?

@hiteshk25
Copy link
Collaborator

@magibney @patsonluk Thanks for working/debugging/fixing/explaning this issue thoroughly. Was bit thinking more about this and thought we should be registering StateWatcher once only (as guys says). Here is my patch, briefly tested on playpen. But let me know what you guys think of it.

https://github.com/cowpaths/fullstory-solr/pull/154/files

image

…er creation

back-ported from active upstream PR as of commit
405e1a8
@magibney magibney changed the base branch from fs/branch_9x to fs/branch_9_3 October 6, 2023 12:19
@magibney magibney force-pushed the michaelgibney/SAI-4661-statewatcher-race-condition branch from adcfe3a to 001d506 Compare October 6, 2023 12:19
@magibney
Copy link
Collaborator Author

magibney commented Oct 6, 2023

switched the base branch to fs/branch_9_3 and backported the in-progress upstream fix. (Will merge this to upstream branch_9x soon, so only doing fs/branch_9_3 here).

Copy link
Collaborator

@patsonluk patsonluk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I think the change proposed is safe and straightforward!

I agree that we would want to do more cleanups later on (on refresh and remove unused methodsd) or use the other proposal @hiteshk25 suggested 👍🏼 https://github.com/cowpaths/fullstory-solr/pull/154/files

@magibney magibney merged commit 84d0d39 into fs/branch_9_3 Oct 6, 2023
2 of 3 checks passed
justinrsweeney pushed a commit that referenced this pull request Apr 26, 2024
…er creation (#151)

back-ported from active upstream PR as of commit
405e1a8
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants