
server: use tenant capability watcher in orchestrator #112001

Merged 1 commit into cockroachdb:master on Oct 9, 2023

Conversation

@stevendanna (Collaborator) commented Oct 7, 2023

This changes over from polling to using the capability watcher to observe virtual cluster service mode changes.

On the surface, it may seem like this makes us less responsive, since our poll interval was 1 second and by default we will only see an update every 3 seconds from the watcher.

However, when we were using the polling approach, the server was likely to fail to start up until the tenant watcher caught up anyway, since the tenant server does a service-mode check using the capability watcher.

Fixes #107827.
Epic: CRDB-26691

Release note: None
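
To make the shape of the change concrete, here is a minimal Go sketch; every name in it is hypothetical (the concrete watcher API was revised during review, below), not the actual orchestrator code:

```go
package sketch

import "context"

// tenantEntry stands in for the watcher's per-tenant state.
type tenantEntry struct {
	Name        string
	ServiceMode string // e.g. "none", "shared"
}

// capWatcher stands in for the tenant capability watcher.
type capWatcher interface {
	// GetAllTenants returns an in-memory snapshot of all tenants plus
	// a channel that is closed when any entry changes.
	GetAllTenants() ([]tenantEntry, <-chan struct{})
}

// orchestrate blocks on the watcher's notification channel instead of
// waking every second to run a SQL query.
func orchestrate(ctx context.Context, w capWatcher, reconcile func([]tenantEntry)) {
	for {
		tenants, changed := w.GetAllTenants()
		reconcile(tenants) // start/stop SQL servers to match service modes
		select {
		case <-ctx.Done():
			return
		case <-changed: // woken only when the rangefeed delivers an update
		}
	}
}
```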

@stevendanna requested reviews from knz and from several teams as code owners, October 7, 2023 23:55
blathers-crl bot commented Oct 7, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity (Member) commented:
This change is Reviewable

@stevendanna (Collaborator, Author) commented

The main thing I don't love about this is that, to avoid missing updates while processing them, we currently block the tenant capability watcher until the callback is done.

An alternative could be some interface that never blocks but lets you know whether you likely missed updates, so that you could do a new complete scan yourself.
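
A hypothetical sketch of that alternative (all names illustrative, not from the PR): the notifier never blocks the watcher's update path; if the consumer is busy, the notification is coalesced and a flag records that updates may have been missed, so the consumer knows to do a full rescan.

```go
package sketch

import (
	"context"
	"sync/atomic"
)

type lossyNotifier struct {
	ch     chan struct{} // buffered, capacity 1
	missed atomic.Bool
}

func newLossyNotifier() *lossyNotifier {
	return &lossyNotifier{ch: make(chan struct{}, 1)}
}

// Notify is called from the watcher's update path; it never blocks.
func (n *lossyNotifier) Notify() {
	select {
	case n.ch <- struct{}{}:
	default:
		// Consumer is behind; remember that it must do a complete scan.
		n.missed.Store(true)
	}
}

// Wait blocks the consumer until the next notification and reports
// whether updates may have been dropped since the last wakeup.
func (n *lossyNotifier) Wait(ctx context.Context) (missed bool, err error) {
	select {
	case <-ctx.Done():
		return false, ctx.Err()
	case <-n.ch:
		return n.missed.Swap(false), nil
	}
}
```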

@knz (Contributor) left a comment


Reviewed 5 of 5 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @stevendanna)


-- commits line 16 at r1:
Pro tip: no need to include Epic: xxx in individual commit messages if there's a common epic for the entire PR, which can be mentioned in the PR description.


pkg/multitenant/tenantcapabilities/tenantcapabilitieswatcher/watcher.go line 169 at r1 (raw file):

```go
// OnUpdate is called after a tenant state update is received. The
// provided function must take care to avoid blocking for too long.
func (w *Watcher) OnUpdate(cb onUpdateCallback) func() {
```

IMHO this is not the right abstraction. There are two problems here:

  1. as you've already identified in the commit message, this will block the rangefeed until the callbacks have completed. We don't really want that -- this watcher is also used to propagate data through the tenant connector, and to update the RPC authorizer. We don't want that to become indefinitely stale just because the server controller has a hiccup.

  2. we already have standardized a robust asynchronous update notification mechanism across the various watchers; and this OnUpdate protocol you're creating here is diverging from that standard mechanism. For the sake of uniformity (learnability, maintenance burden, etc.) I recommend we continue to use the standard protocol.
    May I thus recommend you consider reusing it, with the following examples for inspiration:

  • GetInfo in this file.
  • GetAllTenantOverrides in pkg/server/tenantsettingswatcher/watcher.go
  • GetTenantOverrides in that file too

@stevendanna (Collaborator, Author) left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz)


pkg/multitenant/tenantcapabilities/tenantcapabilitieswatcher/watcher.go line 169 at r1 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote… [quoted above]

:D I started with something like this but convinced myself I was going to miss an update. But, looking at it again, I see that I just had some ordering wrong. I've moved this to a GetAllTenants() method that returns all of the tenants and a channel that is closed if anything changes.
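
A minimal, self-contained sketch of that shape (type and field names are illustrative; the real code lives on the tenantcapabilitieswatcher Watcher):

```go
package sketch

import "sync"

type Entry struct {
	TenantID    uint64
	ServiceMode string
}

type Watcher struct {
	mu struct {
		sync.RWMutex
		store       map[uint64]Entry
		anyChangeCh chan struct{} // closed, then reallocated, on every update
	}
}

// GetAllTenants returns a snapshot of all tenant entries plus a channel
// that is closed the next time anything changes. Because the snapshot
// and the channel are read under the same lock, a caller that
// reconciles the snapshot and then blocks on the channel cannot miss an
// update: any change after the snapshot was taken closes this channel.
func (w *Watcher) GetAllTenants() ([]Entry, <-chan struct{}) {
	w.mu.RLock()
	defer w.mu.RUnlock()
	entries := make([]Entry, 0, len(w.mu.store))
	for _, e := range w.mu.store {
		entries = append(entries, e)
	}
	return entries, w.mu.anyChangeCh
}
```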

@knz (Contributor) left a comment


Reviewed 2 of 2 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @stevendanna)


pkg/multitenant/tenantcapabilities/tenantcapabilitieswatcher/watcher.go line 179 at r2 (raw file):

```go
		}
	}
	w.mu.anyChangeChs = append(w.mu.anyChangeChs, updateCh)
```

You only need one channel; no need for a slice. The channel is always defined, and closed and reallocated on every update event.
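
Continuing the illustrative Watcher sketch above, the single-channel suggestion amounts to a notify helper like this: close the current channel to wake every waiter, then immediately allocate a fresh one for subsequent GetAllTenants callers.

```go
// notifyAnyChangeLocked wakes all current waiters. Caller must hold
// w.mu for writing.
func (w *Watcher) notifyAnyChangeLocked() {
	close(w.mu.anyChangeCh)
	w.mu.anyChangeCh = make(chan struct{})
}
```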


pkg/multitenant/tenantcapabilities/tenantcapabilitieswatcher/watcher.go line 306 at r2 (raw file):

```go
	if len(updates) > 0 {
		w.closeAnyChangeChs()
```

This is suspicious. The existing pattern is: to notify, create a new channel after an unconditional close.


pkg/server/server_controller_orchestration.go line 48 at r2 (raw file):

		for {

nit: remove empty line.

(The squashed commit message repeats the PR description above, with Epic: none.)
@stevendanna (Collaborator, Author) left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz)


pkg/multitenant/tenantcapabilities/tenantcapabilitieswatcher/watcher.go line 179 at r2 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote… [quoted above]

Done.


pkg/multitenant/tenantcapabilities/tenantcapabilitieswatcher/watcher.go line 306 at r2 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote… [quoted above]

We could elide the length check, but that would cause unnecessary work: handleUpdate is called with an empty incremental update every time the frontier advances. When applying incremental updates in handleIncrementalUpdate, we close the relevant channels while iterating over the updates.
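
Continuing the same illustrative sketch, the guard being discussed looks roughly like this: handleUpdate runs on every rangefeed event, including the empty batches emitted each time the frontier advances, so the length check avoids needlessly waking waiters (and re-allocating the channel).

```go
// handleUpdate applies a batch of incremental updates and notifies
// waiters only if the batch actually contained changes.
func (w *Watcher) handleUpdate(updates []Entry) {
	w.mu.Lock()
	defer w.mu.Unlock()
	for _, u := range updates {
		w.mu.store[u.TenantID] = u
	}
	if len(updates) > 0 {
		w.notifyAnyChangeLocked() // close-and-reallocate, once per batch
	}
}
```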

@knz (Contributor) left a comment


LGTM -- you have an optimization to reduce allocs further but I'll leave it up to you.

Reviewed 2 of 2 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @stevendanna)


pkg/multitenant/tenantcapabilities/tenantcapabilitieswatcher/watcher.go line 306 at r2 (raw file):

Previously, stevendanna (Steven Danna) wrote… [quoted above]

Yes, I agree the if len(updates) > 0 is useful. I was just picking up on the for loop.


pkg/multitenant/tenantcapabilities/tenantcapabilitieswatcher/watcher.go line 173 at r3 (raw file):

```go
	defer w.mu.RUnlock()

	entries := make([]tenantcapabilities.Entry, 0, len(w.mu.store))
```

nit: ideally, this slice would be populated during the updates, and simply retrieved here by reference.
(This would need to create a new slice every time the channel is closed/refreshed.)
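
A sketch of the suggested optimization, again with illustrative names (it assumes a hypothetical w.mu.snapshot field): the slice is rebuilt once per update batch, so readers share one immutable snapshot instead of allocating a copy on every call.

```go
// rebuildSnapshotLocked recomputes the shared snapshot. Caller must
// hold w.mu for writing; the published slice is never mutated again.
func (w *Watcher) rebuildSnapshotLocked() {
	entries := make([]Entry, 0, len(w.mu.store))
	for _, e := range w.mu.store {
		entries = append(entries, e)
	}
	w.mu.snapshot = entries
}

// GetAllTenants now just hands out the shared snapshot by reference.
func (w *Watcher) GetAllTenants() ([]Entry, <-chan struct{}) {
	w.mu.RLock()
	defer w.mu.RUnlock()
	return w.mu.snapshot, w.mu.anyChangeCh
}
```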

@stevendanna (Collaborator, Author) commented

Thanks for the careful review.

I'll do the allocation optimization as a follow-up shortly, but I'm going to take advantage of a green CI run and merge this for now.

bors r=knz

craig bot (Contributor) commented Oct 9, 2023

Build failed (retrying...):

@craig craig bot merged commit 0a99ac2 into cockroachdb:master Oct 9, 2023
craig bot (Contributor) commented Oct 9, 2023

Build succeeded:

stevendanna added a commit to stevendanna/cockroach that referenced this pull request Oct 13, 2023
In cockroachdb#112001 we introduced a bug and an unintended behaviour change.

The bug is that if we receive a notification of a state change from
none to shared when the server is still shutting down, that state
change will be ignored. Namely, the following can happen:

1. ALTER VIRTUAL CLUSTER a STOP SERVICE
2. Watcher gets notification of shutdown and notifies virtual
   cluster's SQL server.
3. Tenant "a" starts shutdown but does not fully complete it.
4. ALTER VIRTUAL CLUSTER a START SERVICE SHARED
5. Watcher notifies the server orchestrator; but, since the SQL server has
   not finished stopping from the previous stop request, it appears as if
   it is already started.
6. Tenant "a" finishes shutdown.
7. Server orchestrator never again tries to start the virtual cluster.

The newly added test reveals this under stress.

The behaviour change is that previously, if a SQL server for a virtual
cluster failed to start up, it would be restarted.

Here, we fix both of these by re-introducing a periodic polling of the
service state.  Unlike the previous polling, we poll the watcher state
so we are not generating a SQL query every second.

Further, since we are now calling the tenantcapabilities watcher's
GetAllTenants method every second, in addition to on every update, I've
moved the allocation of the list of all tenants into our update
handler.

An alternative here would be to revert cockroachdb#112001 completely. I think
there are still advantages to using the watcher: not generating a SQL
query on every node once per second, and more responsive server startup
after the integration of cockroachdb#112094.

Fixes cockroachdb#112077

Release note: None
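
A sketch of the fixed loop, continuing the earlier illustrative Watcher sketch (add "context" and "time" to its imports): reconciliation happens on watcher wakeups but also on a coarse timer, so a service-mode change that raced with an in-progress shutdown gets retried instead of being lost. The timer only reads in-memory watcher state; it does not issue SQL.

```go
func orchestrate(ctx context.Context, w *Watcher, reconcile func([]Entry)) {
	t := time.NewTicker(1 * time.Second)
	defer t.Stop()
	for {
		entries, changed := w.GetAllTenants()
		reconcile(entries)
		select {
		case <-ctx.Done():
			return
		case <-changed: // event-driven wakeup from the capability watcher
		case <-t.C: // periodic retry catches transitions missed in a race
		}
	}
}
```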
stevendanna added a commit to stevendanna/cockroach that referenced this pull request Oct 16, 2023 (same commit message as above)
craig bot pushed a commit that referenced this pull request Oct 16, 2023
112295: server: avoid missing service mode changes  r=adityamaru,yuzefovich a=stevendanna

(Same commit message as above.)

Co-authored-by: Steven Danna <[email protected]>
stevendanna added a commit to stevendanna/cockroach that referenced this pull request Oct 16, 2023 (same commit message as above)
stevendanna added a commit to stevendanna/cockroach that referenced this pull request Oct 17, 2023 (same commit message as above)