spanconfig: mark reconciliation job as idle
Fixes cockroachdb#70538.

We have a forever-running background AUTO SPAN CONFIG RECONCILIATION job
on tenant pods. To know when it's safe to wind down pods, we use the
number of currently running jobs as an indicator. Because this job never
finishes, we need a way to signal that, despite the job's presence, it's
safe to wind down.

In cockroachdb#74747 we added a thin API to the jobs subsystem to do just that,
with the intent of using it for idle changefeed jobs. We reuse that same
approach here to mark the reconciliation job as always idle.

Release note: None
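
The wind-down signal described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the jobs subsystem's actual implementation: the `registry` type and its `start`, `markIdle`, `activeJobs`, and `safeToWindDown` methods are hypothetical names invented for this example, and a real registry would also need locking.

```go
package main

import "fmt"

// jobInfo is a hypothetical, simplified view of a running job.
type jobInfo struct {
	name string
	idle bool
}

// registry is a hypothetical stand-in for a jobs registry. It is not
// safe for concurrent use; a real implementation would guard the map.
type registry struct {
	running map[int64]*jobInfo
}

func newRegistry() *registry {
	return &registry{running: make(map[int64]*jobInfo)}
}

// start records a job as running (and not idle).
func (r *registry) start(id int64, name string) {
	r.running[id] = &jobInfo{name: name}
}

// markIdle flags a running job as idle or busy. Unknown IDs are ignored.
func (r *registry) markIdle(id int64, idle bool) {
	if j, ok := r.running[id]; ok {
		j.idle = idle
	}
}

// activeJobs counts running jobs that have not been marked idle.
func (r *registry) activeJobs() int {
	n := 0
	for _, j := range r.running {
		if !j.idle {
			n++
		}
	}
	return n
}

// safeToWindDown reports whether every running job is idle.
func (r *registry) safeToWindDown() bool {
	return r.activeJobs() == 0
}

func main() {
	r := newRegistry()
	r.start(1, "AUTO SPAN CONFIG RECONCILIATION")
	fmt.Println(r.safeToWindDown()) // false: the job is running and not idle

	r.markIdle(1, true)
	fmt.Println(r.safeToWindDown()) // true: the only running job is idle

	r.start(2, "BACKUP")
	fmt.Println(r.safeToWindDown()) // false: a non-idle job is running
}
```

Counting only non-idle jobs lets a forever-running job stay registered without blocking pod shutdown.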
irfansharif authored and RajivTS committed Mar 6, 2022
1 parent 788451a commit a0fe7a6
Showing 2 changed files with 37 additions and 0 deletions.
5 changes: 5 additions & 0 deletions pkg/spanconfig/spanconfigjob/job.go
@@ -45,6 +45,11 @@ func (r *resumer) Resume(ctx context.Context, execCtxI interface{}) error {
	rc := execCtx.SpanConfigReconciler()
	stopper := execCtx.ExecCfg().DistSQLSrv.Stopper

	// The reconciliation job is a forever running background job. It's always
	// safe to wind the SQL pod down whenever it's running -- something we
	// indicate through the job's idle status.
	r.job.MarkIdle(true)

	// Start the protected timestamp reconciler. This will periodically poll the
	// protected timestamp table to cleanup stale records. We take advantage of
	// the fact that there can only be one instance of the spanconfig.Resumer
32 changes: 32 additions & 0 deletions pkg/spanconfig/spanconfigmanager/manager_test.go
@@ -236,3 +236,35 @@ func TestManagerCheckJobConditions(t *testing.T) {
	tdb.Exec(t, `SET CLUSTER SETTING spanconfig.reconciliation_job.check_interval = '25m'`)
	_ = checkInterceptCountGreaterThan(currentCount) // the job check interval setting triggers a check
}

// TestReconciliationJobIsIdle ensures that the reconciliation job, when
// resumed, is marked as idle.
func TestReconciliationJobIsIdle(t *testing.T) {
	defer leaktest.AfterTest(t)()

	var jobID jobspb.JobID
	ctx := context.Background()
	tc := testcluster.StartTestCluster(t, 1, base.TestClusterArgs{
		ServerArgs: base.TestServerArgs{
			Knobs: base.TestingKnobs{
				SpanConfig: &spanconfig.TestingKnobs{
					ManagerCreatedJobInterceptor: func(jobI interface{}) {
						jobID = jobI.(*jobs.Job).ID()
					},
				},
			},
		},
	})
	defer tc.Stopper().Stop(ctx)

	jobRegistry := tc.Server(0).JobRegistry().(*jobs.Registry)
	testutils.SucceedsSoon(t, func() error {
		if jobID == jobspb.JobID(0) {
			return errors.New("waiting for reconciliation job to be started")
		}
		if !jobRegistry.TestingIsJobIdle(jobID) {
			return errors.New("expected reconciliation job to be idle")
		}
		return nil
	})
}
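
The test above waits for a condition set from a background callback by retrying a closure until it returns nil. A rough, standalone sketch of that polling pattern is shown below. The helper `succeedsWithin` and the error value `errNotReady` are hypothetical names for this illustration and are not the `testutils.SucceedsSoon` implementation. The sketch also reads the shared ID through `sync/atomic`, an assumption made here so that the background write is safe to observe from the polling goroutine.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

// errNotReady is a hypothetical sentinel returned while the condition
// being polled has not been met yet.
var errNotReady = errors.New("waiting for job to be created")

// succeedsWithin is a hypothetical helper: it calls fn repeatedly,
// sleeping interval between attempts, until fn returns nil or timeout
// has elapsed. On timeout it returns the last error, wrapped.
func succeedsWithin(timeout, interval time.Duration, fn func() error) error {
	deadline := time.Now().Add(timeout)
	for {
		err := fn()
		if err == nil {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("condition not met within %s: %w", timeout, err)
		}
		time.Sleep(interval)
	}
}

func main() {
	// jobID is written by a background goroutine (standing in for an
	// interceptor callback) and read by the polling closure, so both
	// sides go through sync/atomic.
	var jobID int64
	go func() {
		time.Sleep(20 * time.Millisecond)
		atomic.StoreInt64(&jobID, 42)
	}()

	err := succeedsWithin(time.Second, 5*time.Millisecond, func() error {
		if atomic.LoadInt64(&jobID) == 0 {
			return errNotReady
		}
		return nil
	})
	fmt.Println(err) // <nil>
}
```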
