sql: create general logging job to capture index usage stats #76752
Conversation
Force-pushed 42b6818 to b6fdb20
Reviewed 3 of 20 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @Azhng, and @THardy98)
Is there an epic/doc/issue you can point me to that describes what this is supposed to accomplish?
Is there an SQL interface for creating such schedules? What's the intended usage and use case? Basically, any information
that can help me review the code better.
Reviewed 1 of 20 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @Azhng, and @THardy98)
pkg/sql/scheduledloggingjobs/scheduled_logging_job.go, line 31 at r2 (raw file):
//////////////////////////////////////
//////// SCHEDULED JOB LOGIC ////////
//////////////////////////////////////
nit: we don't really have this comment style in our codebase...
pkg/sql/scheduledloggingjobs/scheduled_logging_job.go, line 150 at r2 (raw file):
// If an existing logging job of the same type is found, an error is returned,
// preventing a duplicate job from being created.
func checkExistingLoggingJob(
normally, this shouldn't be needed -- the scheduled job system, by default, ensures only 1 instance of a job runs at a time...
Force-pushed b57dade to 6e87ad3
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @Azhng, @miretskiy, and @THardy98)
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
Is there an epic/doc/issue you can point me to that describes what this is supposed to accomplish?
Is there an SQL interface for creating such schedules? What's the intended usage and use case? Basically, any information
that can help me review the code better.
Linked the issue this is intended to resolve, and added a quick note about the idea behind this change.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @Azhng, and @miretskiy)
pkg/sql/scheduledloggingjobs/scheduled_logging_job.go, line 31 at r2 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
nit: we don't really have this comment style in our codebase...
Removed.
pkg/sql/scheduledloggingjobs/scheduled_logging_job.go, line 150 at r2 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
normally, this shouldn't be needed -- the scheduled job system, by default, ensures only 1 instance of a job runs at a time...
Oh I didn't know that. How does it check whether an instance of a job is already running?
Resolves: cockroachdb#72486

Previously, we did not capture index usage metrics to our telemetry logging channel. The intention of this change is to introduce a scheduler for jobs that emit logging events.

Currently, we want to capture index usage statistics to the telemetry logging channel on a time interval, but being able to emit scheduled logs for different statistics/events would be useful telemetry going forward, which motivates the creation of a scheduler that supports different logging jobs.

This change introduces the `scheduledloggingjobs` package. `CreateScheduledLoggingJob` in `scheduledloggingjobs` allows you to create a scheduler with parameters for:
- label
- schedule interval
- executor type
- schedule details
- execution arguments

The Resumer and Executor logic for the scheduled jobs resides in `sql/scheduled_logging_jobs.go`. The Resumer receives a callback function that returns the logging events it wants to emit. The Executor receives the job details/progress (used to create the job record when the job is executed) and the job type (to check that a job of the same type isn't already running when we try to create the job).

The `scheduledloggingjobs` package also contains a `Controller` whose purpose is to `Start` all the scheduled logging jobs in its interface. The `Controller`'s `Start` method is hooked into the startup of the `sql.Server`.

Assuming its `jobs.proto` definitions and `wrap.go` cases are covered, a new logging job can be added by:
- defining its resume callback function
- registering its log resumer and executor with `RegisterLogResumerAndScheduleExecutor`
- defining its arguments (label, schedule interval, executor type, schedule details, execution arguments) and calling `CreateScheduledLoggingJob`
- adding it to the `Controller` interface and `Start` method (if you want to create the scheduled job on server startup)

This change was made with internal use in mind; I didn't consider a SQL interface for creating these schedules.

Release note (sql change): Initial implementation of a general logging job, used to create a job that captures index usage statistics to the telemetry logging channel.
Force-pushed 6e87ad3 to b5db485
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @miretskiy, and @THardy98)
pkg/sql/conn_executor.go, line 488 at r4 (raw file):
s.txnIDCache.Start(ctx, stopper)
s.scheduledLoggingJobsController.Start(ctx)
A `.Start(ctx)` function here usually implies that there are one or more long-running background goroutines that need to be started using the stopper. In CRDB we don't directly start a goroutine using the `go` keyword; using the stopper allows us to stop all the background goroutines when gracefully shutting down a server.
Looking at the `scheduledLoggingJobsController`, it seems like the `Start()` method simply starts a transaction and creates the scheduled job record. This logic can live within the constructor of the object. Alternatively, if you want to follow the automatic job pattern, you can kick off a background goroutine here that periodically wakes up to check if there exists a scheduled job record for your scheduled job.
pkg/sql/scheduled_logging_job.go, line 32 at r4 (raw file):
func init() {
	RegisterLogResumerAndScheduleExecutor(
Since you can only register jobs within the sql package, perhaps there's no need to export this function?
pkg/sql/scheduledloggingjobs/captured_index_usage_stats.go, line 39 at r4 (raw file):
"sql.capture_index_usage_stats.telemetry.interval",
"cron-tab interval to capture index usage statistics to the telemetry logging channel",
"0 */8 * * *", /* defaultValue */
Hmm, what does this cronexpr mean in english 🤔 ?
pkg/sql/scheduledloggingjobs/captured_index_usage_stats.go, line 92 at r4 (raw file):
expectedNumDatums = 10
var allEvents []eventpb.EventPayload
for _, databaseName := range allDatabaseNames {
What happens if the job is paused or cancelled while this loop is still running?
pkg/sql/scheduledloggingjobs/scheduled_logging_job_controller.go, line 64 at r4 (raw file):
// createCaptureIndexUsageStatsScheduledJob creates all scheduled logging jobs.
// createCaptureIndexUsageStatsScheduledJob implements the sql.Controller interface.
func (c *LoggingJobsController) createCaptureIndexUsageStatsScheduledJob(
During cluster startup, every single node will go and attempt to create this scheduled job (and since this schedule already exists, all will fail). This code is synchronous, which means it will block cluster startup due to contention.
Reviewed 1 of 2 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @miretskiy, and @THardy98)
pkg/sql/scheduledloggingjobs/scheduled_logging_job.go, line 150 at r2 (raw file):
Previously, THardy98 (Thomas Hardy) wrote…
Oh I didn't know that. How does it check whether an instance of a job is already running?
When you specify the CreatedBy field when creating a new job, that links the job with the schedule; and the scheduler will (by default) wait for the previous job's completion before starting the new one.
I'm curious: why is this whole thing implemented as a job? Jobs are long-running processes that must survive node restarts.
There were two reasons:
- the job system offered an existing way to create a long running scheduled background process
- with the idea to support different log-emitting jobs in the future, some of those jobs may make use of other features of the jobs system like checkpointing (although this job doesn't)
That being said, maybe it's overkill, and it's unlikely that background logging would make effective use of the jobs system.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @miretskiy, and @THardy98)
I think we should err on the side of simplicity. If this particular use case doesn't need jobs, then let's not add another job type and schedule. If future uses prove the need for that, then (as can be seen from this PR) unifying the code to use a new schedule/job type is fairly straightforward.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @miretskiy, and @THardy98)
Sounds good, I'll close this PR. I've created PR #76886 in its place. Feel free to take a look at that if you like, although it doesn't use the jobs system.
TYFR @miretskiy and @Azhng 👍
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @miretskiy, and @THardy98)