-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ttl: show table name for jobs in SHOW SCHEDULES #107294
ttl: show table name for jobs in SHOW SCHEDULES #107294
Conversation
I was a bit uncertain about how to best handle certain cases. These are marked as "TODO" and I intend to resolve them before merging. I would appreciate any input! |
pkg/sql/rename_table.go
Outdated
if tableDesc.HasRowLevelTTL() { | ||
const ( | ||
opname = `update ttl job names` | ||
query = `update system.scheduled_jobs set schedule_name = 'row-level-ttl-' || $1 WHERE schedule_id = $2` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: is it better to use capitals and some parenthesis here like
UPDATE system.scheduled_jobs SET schedule_name = ('row-level-ttl-' || $1) WHERE schedule_id = $2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
pkg/sql/create_table.go
Outdated
sj.SetOwner(owner) | ||
sj.SetScheduleDetails(jobspb.ScheduleDetails{ | ||
Wait: jobspb.ScheduleDetails_WAIT, | ||
// If a job fails, try again at the allocated cron time. | ||
OnError: jobspb.ScheduleDetails_RETRY_SCHED, | ||
}) | ||
|
||
if err := sj.SetSchedule(ttl.DeletionCronOrDefault()); err != nil { | ||
// TODO: Is it better to check that RowLevelTTL is not nil and return an error? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: TODO(username(s)):
here/below.
(If they're to be discussed in review and resolved before merging, can do TODO(BEFORE MERGING) or something to indicate as such)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to dt's point
Also, If we're setting schedule for a nil TTL, it's a critical bug that we should let it fail hard. but it's not nice to panic. We could return a assertion error with errors.AssertionFailedf
to feel safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the confusion. This was intended to be a discussion at review time. I'll use TODO(BEFORE MERGING)
in the future!
Also, If we're setting schedule for a nil TTL, it's a critical bug that we should let it fail hard. but it's not nice to panic. We could return a assertion error with errors.AssertionFailedf to feel safe.
Agreed that panicking isn't great. More generally, is the style in CRDB is to assume that programmers are doing the right thing or is it to be paranoid? Because of how this function got refactored and how go's type system works, callers could accidentally cause nil dereferences. I find myself running into these types of cases frequently. In general I prefer to write types in such a way that this can't happen but that's not always on option (See: this PR). My personal rule is to be paranoid at boundaries and expect well shaped data inside of said boundaries. I don't really have a sense of what is or is not considered to be a boundary yet. sql.CreateRowLevelTTLScheduledJob
passes tblDesc
right through, so we could do a check at the package level but this metaphorical boundary might exist somewhere else instead. What would either of you do for this specific situation and does that follow a more generalizable rule?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mhm, I think you already made good points here, we can do the check in CreateRowLevelTTLScheduledJob
given it's a public function at the package level. I personally often makes assumptions in non-public functions, of course sometimes do such kind of validations in a helper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Added the check to CreateRowLevelTTLScheduleJob
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm! only minor things
---- | ||
0 | ||
|
||
subtest end | ||
|
||
# Ensure that SHOW SCHEDULES's label column is updated to reflect the new table |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a test case where the schedule name uses id (mock with a manual update in test) and it's changed to use name after renaming the table? Just want to make sure your update query works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
pkg/sql/rename_table.go
Outdated
@@ -232,6 +233,29 @@ func (n *renameTableNode) startExec(params runParams) error { | |||
return err | |||
} | |||
|
|||
// If this table has row level ttl enabled, update the schedule_name of all | |||
// row-level-ttl jobs with the name table name. | |||
if tableDesc.HasRowLevelTTL() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe extract this logic into a separate helper function. I'm ok with what it's like now though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there's a super clear place to put it right now. Could we save the refactor until rename table is implemented in the declarative schema changer? I think having two consumers will make the eventual "home" of this logic much more clear.
pkg/sql/create_table.go
Outdated
sj.SetOwner(owner) | ||
sj.SetScheduleDetails(jobspb.ScheduleDetails{ | ||
Wait: jobspb.ScheduleDetails_WAIT, | ||
// If a job fails, try again at the allocated cron time. | ||
OnError: jobspb.ScheduleDetails_RETRY_SCHED, | ||
}) | ||
|
||
if err := sj.SetSchedule(ttl.DeletionCronOrDefault()); err != nil { | ||
// TODO: Is it better to check that RowLevelTTL is not nil and return an error? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to dt's point
Also, If we're setting schedule for a nil TTL, it's a critical bug that we should let it fail hard. but it's not nice to panic. We could return a assertion error with errors.AssertionFailedf
to feel safe.
pkg/sql/rename_table.go
Outdated
opname = `update ttl job names` | ||
query = `update system.scheduled_jobs set schedule_name = 'row-level-ttl-' || $1 WHERE schedule_id = $2` | ||
) | ||
// TODO Should the number of modified rows be inspected here? Could log a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think renaming a table should handle this. It's part of what the TTL infra itself should check. cc @ecwall to see if there is a strong guarantee here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay! I've removed this comment. Happy to revisit if Evan has opinions.
d664445
to
8fc3b5f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @chengxiong-ruan, @chrisseto, @dt, @ecwall, and @rhu713)
pkg/sql/rename_table.go
line 238 at r1 (raw file):
Previously, chrisseto (Chris Seto) wrote…
I don't think there's a super clear place to put it right now. Could we save the refactor until rename table is implemented in the declarative schema changer? I think having two consumers will make the eventual "home" of this logic much more clear.
Lets move this over to MetaDataUpdater then the declarative side can easily consume it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @chengxiong-ruan, @chrisseto, @dt, @ecwall, and @rhu713)
pkg/sql/rename_table.go
line 238 at r1 (raw file):
Previously, fqazi (Faizan Qazi) wrote…
Lets move this over to MetaDataUpdater then the declarative side can easily consume it
*MetadataUpdater
Previously, fqazi (Faizan Qazi) wrote…
Good idea. I like that. |
fac4a98
to
933cf8f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @chengxiong-ruan, @dt, @ecwall, @fqazi, and @rhu713)
pkg/sql/rename_table.go
line 238 at r1 (raw file):
Previously, chengxiong-ruan (Chengxiong Ruan) wrote…
Good idea. I like that.
Oh nice! That's the perfect spot. Moved.
pkg/sql/rename_table.go
line 243 at r1 (raw file):
Previously, chrisseto (Chris Seto) wrote…
Okay! I've removed this comment. Happy to revisit if Evan has opinions.
Done.
933cf8f
to
2b26b9f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @dt, @ecwall, @fqazi, and @rhu713)
2b26b9f
to
608e989
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @chengxiong-ruan, @chrisseto, @dt, @ecwall, @fqazi, and @rhu713)
pkg/sql/create_table.go
line 2434 at r5 (raw file):
) (*jobs.ScheduledJob, error) { sj := jobs.NewScheduledJob(env) sj.SetScheduleLabel(fmt.Sprintf("row-level-ttl-%s", tblDesc.GetName()))
nit: i think we may actually want both the name and ID.
the primary reason is that table names are not unique - the fully qualified name of database_name.schema_name.table_name
is, but it doesn't seem like we have those other names easily available. (and it just seems like a headache to make all those RENAME operations get reflected in this schedule label too.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @chengxiong-ruan, @dt, @ecwall, @fqazi, @rafiss, and @rhu713)
pkg/sql/create_table.go
line 2434 at r5 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
nit: i think we may actually want both the name and ID.
the primary reason is that table names are not unique - the fully qualified name of
database_name.schema_name.table_name
is, but it doesn't seem like we have those other names easily available. (and it just seems like a headache to make all those RENAME operations get reflected in this schedule label too.)
Is there a need to make the label unique? The table ID is available in the "args" column, if disambiguation is needed. I do think that fully qualified name would be much more helpful, I just didn't know how to get it. 😅
root@localhost:36257/defaultdb2> SHOW SCHEDULES;
id | label | schedule_status | next_run | state | recurrence | jobsrunning | owner | created | on_wait | on_error | command
---------------------+----------------------+-----------------+------------------------+-----------+------------+-------------+-------+-------------------------------+---------+-------------+-------------------
883280186681032705 | sql-schema-telemetry | ACTIVE | 2023-07-24 09:37:00+00 | pending | 37 9 * * 1 | 0 | node | 2023-07-17 20:35:20.996534+00 | SKIP | RETRY_SCHED | {}
883280186749747201 | sql-stats-compaction | ACTIVE | 2023-07-21 21:00:00+00 | succeeded | @hourly | 0 | node | 2023-07-17 20:35:21.038388+00 | SKIP | RETRY_SCHED | {}
883280439706157057 | row-level-ttl-104 | ACTIVE | 2023-07-21 21:00:00+00 | NULL | @hourly | 0 | root | 2023-07-17 20:36:38.237751+00 | WAIT | RETRY_SCHED | {"tableId": 104}
885528023598104577 | row-level-ttl-tbl | ACTIVE | 2023-07-25 20:00:00+00 | NULL | @hourly | 0 | root | 2023-07-25 19:08:26.403993+00 | WAIT | RETRY_SCHED | {"tableId": 107}
885528044534595585 | row-level-ttl-tbl | ACTIVE | 2023-07-25 20:00:00+00 | NULL | @hourly | 0 | root | 2023-07-25 19:08:32.797117+00 | WAIT | RETRY_SCHED | {"tableId": 108}
Hmmmmm, you know, I added this example to showcase my point but I think it's made me agree with you. The fully qualified name would be much nicer but we'd have to hook into database and schema renames as well and it would make the labels MUCH longer. Isrow-level-ttl-107-tbl
a good middle ground? If there's no limitation on the format of label, row-level-ttl: 107 (tbl)
or something up that alley might be a bit more pleasant to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @chengxiong-ruan, @chrisseto, @dt, @ecwall, @fqazi, and @rhu713)
pkg/sql/create_table.go
line 2434 at r5 (raw file):
Previously, chrisseto (Chris Seto) wrote…
Is there a need to make the label unique? The table ID is available in the "args" column, if disambiguation is needed. I do think that fully qualified name would be much more helpful, I just didn't know how to get it. 😅
root@localhost:36257/defaultdb2> SHOW SCHEDULES; id | label | schedule_status | next_run | state | recurrence | jobsrunning | owner | created | on_wait | on_error | command ---------------------+----------------------+-----------------+------------------------+-----------+------------+-------------+-------+-------------------------------+---------+-------------+------------------- 883280186681032705 | sql-schema-telemetry | ACTIVE | 2023-07-24 09:37:00+00 | pending | 37 9 * * 1 | 0 | node | 2023-07-17 20:35:20.996534+00 | SKIP | RETRY_SCHED | {} 883280186749747201 | sql-stats-compaction | ACTIVE | 2023-07-21 21:00:00+00 | succeeded | @hourly | 0 | node | 2023-07-17 20:35:21.038388+00 | SKIP | RETRY_SCHED | {} 883280439706157057 | row-level-ttl-104 | ACTIVE | 2023-07-21 21:00:00+00 | NULL | @hourly | 0 | root | 2023-07-17 20:36:38.237751+00 | WAIT | RETRY_SCHED | {"tableId": 104} 885528023598104577 | row-level-ttl-tbl | ACTIVE | 2023-07-25 20:00:00+00 | NULL | @hourly | 0 | root | 2023-07-25 19:08:26.403993+00 | WAIT | RETRY_SCHED | {"tableId": 107} 885528044534595585 | row-level-ttl-tbl | ACTIVE | 2023-07-25 20:00:00+00 | NULL | @hourly | 0 | root | 2023-07-25 19:08:32.797117+00 | WAIT | RETRY_SCHED | {"tableId": 108}
Hmmmmm, you know, I added this example to showcase my point but I think it's made me agree with you. The fully qualified name would be much nicer but we'd have to hook into database and schema renames as well and it would make the labels MUCH longer. Is
row-level-ttl-107-tbl
a good middle ground? If there's no limitation on the format of label,row-level-ttl: 107 (tbl)
or something up that alley might be a bit more pleasant to read.
right, your example is what i had in mind. i think both ID and table name is good to be both helpful enough and provide full disambiguation for the users who do have the same table names in different places. a convention we use sometimes is to put the ID in square brackets. something like row-level-ttl: table_name [107]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @chengxiong-ruan, @dt, @ecwall, @fqazi, @rafiss, and @rhu713)
pkg/sql/create_table.go
line 2434 at r5 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
right, your example is what i had in mind. i think both ID and table name is good to be both helpful enough and provide full disambiguation for the users who do have the same table names in different places. a convention we use sometimes is to put the ID in square brackets. something like
row-level-ttl: table_name [107]
I like that! I'll go ahead and pull this all into a function in the TTL package so it's easier to refactor in the future.
608e989
to
d3ed6c1
Compare
d3ed6c1
to
8f50475
Compare
Previously, `SHOW SCHEDULES` would only show a table's ID for row level TTL jobs. For the convenience of debugging, this commit upgrades the `label` column to contain the table's name instead of its ID. Rather than performing table name resolution at query time, the job name is set at creation time and updated by the schema change during table renames. This minimizes the likelihood of `SHOW SCHEDULES` being rendered unusable when a cluster is in an unstable state. Epic: CRDB-18322 Fixes: 93774
8f50475
to
fc59204
Compare
Friendly ping @rafiss :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks good! thanks for the ping
Reviewed 2 of 7 files at r1, 1 of 4 files at r2, 16 of 16 files at r6, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @chengxiong-ruan, @dt, @ecwall, @fqazi, and @rhu713)
TFTR! bors r+ |
Hmm, I just realized that there likely isn't going to be a test that will fail once rename table is implemented in the declarative schema changer. I'll find or find an issue to make sure this doesn't get forgotten. |
Build succeeded: |
Previously,
SHOW SCHEDULES
would only show a table's ID for row level TTL jobs. For the convenience of debugging, this commit upgrades thelabel
column to contain the table's name instead of its ID.Rather than performing table name resolution at query time, the job name is set at creation time and updated by the schema change during table renames. This minimizes the likelihood of
SHOW SCHEDULES
being rendered unusable when a cluster is in an unstable state.Epic: CRDB-18322
Fixes: #93774