-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rfc: row level TTL #75189
rfc: row level TTL #75189
Conversation
cc @ajwerner @dt @nvanbenschoten if you wanted to take a look |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @otan, @rafiss, and @vy-ton)
docs/RFCS/20220120_row_level_ttl.md, line 23 at r1 (raw file):
id INT PRIMARY KEY, text TEXT ) TTL '5 minutes'
Is this a standard PostgreSQL syntax?
If yes, please disregard the rest of my comment.
Otherwise, I'd like to recommend to not add syntax to CREATE, and instead focus on (and promote) the ALTER syntax.
There are a few reasons for this.
-
the argument after
TTL
will be (sometimes, like in this example) an interval value. The Interval syntax uses keywords (including TO and ON) which are at risk of becoming ambiguous with other syntax clauses that PosgreSQL allows at the end of CREATE. -
we'll likely want to support placeholders and other types of expression for the value parts of the K/V pair syntax presented below, to feed in constants and other computations. However, placeholder / expression support in CREATE is a more difficult endeavor than other statements IIRC.
-
we know from experience that it's hard in ORMs and other client-side SQL frameworks to add clauses to CREATE statements. However, these are OK with ALTER.
-
I'm not sure what adding the support in CREATE brings, given that the ALTER can be run immediately after on an empty table. One syntax extension / code path is simpler to implement/maintain than two.
docs/RFCS/20220120_row_level_ttl.md, line 30 at r1 (raw file):
**Note**: this does NOT cover partition level deletes, where deletes can be *"accelerated" by deleting entire "partitions" at a time.
nit: stray *
that breaks the formatting below
docs/RFCS/20220120_row_level_ttl.md, line 86 at r1 (raw file):
id INT PRIMARY KEY, text TEXT, crdb_internal_expiration TIMESTAMPTZ NOT NULL DEFAULT current_timestamp() + '5 minutes' ON UPDATE current_timestamp() + '5 minutes'
here and below, can you add the NOT VISIBLE clause to highlight the column is hidden
docs/RFCS/20220120_row_level_ttl.md, line 98 at r1 (raw file):
text TEXT, other_ts_field TIMESTAMPTZ ) TTL (expiration_column = 'other_ts_field')
why the string literal syntax? Can't we use an identifier here?
docs/RFCS/20220120_row_level_ttl.md, line 147 at r1 (raw file):
However, due to the limitation that `ON UPDATE` cannot reference a table expression, this cannot yet be implemented. This can change when we implemented
nit: when we implement
docs/RFCS/20220120_row_level_ttl.md, line 157 at r1 (raw file):
```sql ALTER TABLE tbl SET TTL; -- introduce a TTL. no-op if there is already a TTL
I don't understand what the clause does if there's no specification afterwards. What does "introduce a TTL" mean if there's no expiration?
docs/RFCS/20220120_row_level_ttl.md, line 163 at r1 (raw file):
Note initially any options applied to the TTL in regards to the job will not
what is "the job" in this sentence? You've not defined it yet.
docs/RFCS/20220120_row_level_ttl.md, line 185 at r1 (raw file):
job gets created (or modified) by the syntax detailed in the previous section. The deletes are a SQL-based delete, meaning they will show up on changefeeds as a regular delete by the root user.
Would it make sense to register another special user for background privilege tasks, member of the admin group?
We'll want to distinguish these activities in telemetry somehow. A different user would do it, perhaps also other attributes.
docs/RFCS/20220120_row_level_ttl.md, line 191 at r1 (raw file):
On each job invocation, the deletion job performs the following: * Establish a "start time". * Paginate over all ranges for the table. Allocate these to a "range worker pool".
Would it make sense to tweak the behavior of the job, if there's an index on the TTL column(s)? In that case, there's no need to scan all rows.
docs/RFCS/20220120_row_level_ttl.md, line 200 at r1 (raw file):
* Issue a SQL DELETE up to delete_batch_size entries from the table using the PRIMARY KEYs selected above.
Throughout the RFC: I see 3 stray spaces on an empty line sometimes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great to see this, and nice job putting it together! 🎉
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz, @otan, @rafiss, and @vy-ton)
docs/RFCS/20220120_row_level_ttl.md, line 12 at r1 (raw file):
Row-level "time to live" (TTL) is a mechanism in which rows from a table automatically get deleted once the row surpasses an expiration time (the "TTL"). This has been a [feature commonly asked for](20239).
Nit: this link is broken.
docs/RFCS/20220120_row_level_ttl.md, line 16 at r1 (raw file):
This RFC proposes a CockroachDB level mechanism to support row-level TTL, where rows will be deleted after a certain period of time. As a further extension in a later release, rows will automatically be hidden after expiring a TTL.
Nit: for clarity here, does it make sense to change this to "...rows will be automatically hidden after they've expired their TTL and before they've been physically deleted"
docs/RFCS/20220120_row_level_ttl.md, line 23 at r1 (raw file):
Previously, knz (kena) wrote…
Is this a standard PostgreSQL syntax?
If yes, please disregard the rest of my comment.
Otherwise, I'd like to recommend to not add syntax to CREATE, and instead focus on (and promote) the ALTER syntax.
There are a few reasons for this.
the argument after
TTL
will be (sometimes, like in this example) an interval value. The Interval syntax uses keywords (including TO and ON) which are at risk of becoming ambiguous with other syntax clauses that PosgreSQL allows at the end of CREATE.we'll likely want to support placeholders and other types of expression for the value parts of the K/V pair syntax presented below, to feed in constants and other computations. However, placeholder / expression support in CREATE is a more difficult endeavor than other statements IIRC.
we know from experience that it's hard in ORMs and other client-side SQL frameworks to add clauses to CREATE statements. However, these are OK with ALTER.
I'm not sure what adding the support in CREATE brings, given that the ALTER can be run immediately after on an empty table. One syntax extension / code path is simpler to implement/maintain than two.
Alternatively, we could go with the standard PG mechanism of using a storage_parameter
. Following the PG syntax, this would look like:
CREATE TABLE tbl (
id INT PRIMARY KEY,
test TEXT
) WITH (ttl_expiry = '5 minutes')
docs/RFCS/20220120_row_level_ttl.md, line 75 at r1 (raw file):
id INT PRIMARY KEY, text TEXT ) TTL (expiration_time = '5 minutes', delete_batch_size = 1, job_cron = '* * 1 * *')
As mentioned, I think we should go with WITH (ttl_foo = X, ttl_bar = Y, ...)
docs/RFCS/20220120_row_level_ttl.md, line 121 at r1 (raw file):
// DurationExpr is the automatically assigned interval for when the TTL should apply to a row. optional string duration_expr = 1 [(gogoproto.nullable)=false]; optional string deletion_cron = 2 [(gogoproto.nullable)=false];
What's DeletionCron for?
docs/RFCS/20220120_row_level_ttl.md, line 131 at r1 (raw file):
User override for the
crdb_internal_expiration
column
Nit: For readability, you might want to move this section lower down where we talk about extensions post-MVP.
docs/RFCS/20220120_row_level_ttl.md, line 160 at r1 (raw file):
ALTER TABLE tbl SET TTL (expiration_column = 'other_ts_field'); -- introduce a TTL with the given option OR apply the given option to the TTL ALTER TABLE tbl SET TTL (expiration_column = NULL) -- drop an option from the TTL
Does this turn off TTL? If so, perhaps we can use the RESET
syntax that PG has for storage parameters instead.
docs/RFCS/20220120_row_level_ttl.md, line 171 at r1 (raw file):
```sql ALTER TABLE tbl DROP TTL
Nit: RESET?
docs/RFCS/20220120_row_level_ttl.md, line 174 at r1 (raw file):
The DROP will not drop any TTL columns. However, recreating the TTL column will
I thought that we reversed course on this, and now want to drop the column since leaving it around doesn't provide much value.
docs/RFCS/20220120_row_level_ttl.md, line 177 at r1 (raw file):
not succeed as it may risk expiring all the rows immediately. Instead, the user must either DROP the existing TTL columns or specify it as the `ttl_column` option. A hint will be provided detailing this. Deletion Process When a TTL
Was this last sentence intentional? Is it the error message/hint? Should it be referring to the column instead of the job?
docs/RFCS/20220120_row_level_ttl.md, line 194 at r1 (raw file):
* Then, for each range worker performs the following: * Traverse the range in order of PRIMARY KEY: * Issue a SQL SELECT AS OF SYSTEM TIME '-30s' query on the PRIMARY KEYs up to
Nit: Are we going to limit the setting of TTL, say to a minimum of 5 minutes? A very short TTL (5 seconds) would cause problems if the job is only running every 30 seconds.
docs/RFCS/20220120_row_level_ttl.md, line 204 at r1 (raw file):
To ensure the deletion job does not affect foreground traffic, we plan on using [admission control](admission control) on a SQL transaction at the lowest possible value (-256).
Nit: To leave room for other jobs in the future which may be even lower priority, perhaps we want to hedge a bit, say by using -200?
docs/RFCS/20220120_row_level_ttl.md, line 232 at r1 (raw file):
Option | Description --- | --- select_batch_size | How many rows to fetch from the range that have expired at a given time. Defaults to 500.
If we go with the PG WITH
syntax, all of these should be prefaced by ttl_
docs/RFCS/20220120_row_level_ttl.md, line 234 at r1 (raw file):
select_batch_size | How many rows to fetch from the range that have expired at a given time. Defaults to 500. delete_batch_size | How many rows to delete at a time. Defaults to 100. range_concurrency | How many concurrent ranges are being worked on at a time. Defaults to 8.
Nit: Should this default to some multiple of core count?
docs/RFCS/20220120_row_level_ttl.md, line 252 at r1 (raw file):
Rows that have expired their TTL can be optionally removed from all SQL query results. However, this is not planned to be implemented for the first release of TTL.
Do we have issues open for all of the pieces we're not planning on implementing in the MVP? If so, might be nice to reference them here.
docs/RFCS/20220120_row_level_ttl.md, line 259 at r1 (raw file):
DELETE/ON UPDATE CASCADEs before we can look at allowing this functionality. ## Observability
Might be good to get @maryliag's thoughts on this section.
docs/RFCS/20220120_row_level_ttl.md, line 261 at r1 (raw file):
## Observability As mentioned above, we expect users will need to tinker with the deletion job to ensure foreground traffic is not affected.
Was this statement written before the admission control work? I was under the impression that we're no longer expecting users to have to tinker with these jobs.
docs/RFCS/20220120_row_level_ttl.md, line 271 at r1 (raw file):
progress: * selection and deletion rate * selection and deletion latency
Any thought as to how these will be exposed? Something like SHOW TTL STATISTICS FOR TABLE foo
?
docs/RFCS/20220120_row_level_ttl.md, line 314 at r1 (raw file):
the experience of this is perceived as a little clunky in terms of enabling or disabling TTL compared to the currently proposed syntax as it is unclear that the table columns themselves may be modified.
Can you say more about this last part? I'm of the opinion that we should draft off of PG unless the syntax is really unusable. I'm not seeing that here, but it's possible I'm missing something.
docs/RFCS/20220120_row_level_ttl.md, line 324 at r1 (raw file):
secondary index entry before the delete on the primary index entry. This is predicated on filtering out expired rows working. * Admission control improvements: In future, we can explore applying
This seems to be outdated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @knz, @otan, @rafiss, and @vy-ton)
docs/RFCS/20220120_row_level_ttl.md, line 23 at r1 (raw file):
Previously, ajstorm (Adam Storm) wrote…
Alternatively, we could go with the standard PG mechanism of using a
storage_parameter
. Following the PG syntax, this would look like:CREATE TABLE tbl ( id INT PRIMARY KEY, test TEXT ) WITH (ttl_expiry = '5 minutes')
I'm not a fan of the storage parameter syntax, see below.
the ALTER is interesting; i guess, how does this affect not allowing this syntax in CREATE? ORMs can't exactly use SHOW CREATE TABLE, so it just seems like additional syntactical sugar for me to introduce the TTL.
docs/RFCS/20220120_row_level_ttl.md, line 75 at r1 (raw file):
Previously, ajstorm (Adam Storm) wrote…
As mentioned, I think we should go with
WITH (ttl_foo = X, ttl_bar = Y, ...)
I disagree, and we should have that discussion in the relevant drawback section below.
docs/RFCS/20220120_row_level_ttl.md, line 86 at r1 (raw file):
Previously, knz (kena) wrote…
here and below, can you add the NOT VISIBLE clause to highlight the column is hidden
my bad, added!
docs/RFCS/20220120_row_level_ttl.md, line 98 at r1 (raw file):
Previously, knz (kena) wrote…
why the string literal syntax? Can't we use an identifier here?
there are many different option types, was hoping to use the postgres-esque (key=value, key=value)
syntax here.
docs/RFCS/20220120_row_level_ttl.md, line 121 at r1 (raw file):
Previously, ajstorm (Adam Storm) wrote…
What's DeletionCron for?
Done.
docs/RFCS/20220120_row_level_ttl.md, line 157 at r1 (raw file):
Previously, knz (kena) wrote…
I don't understand what the clause does if there's no specification afterwards. What does "introduce a TTL" mean if there's no expiration?
we give it a default. appended text
docs/RFCS/20220120_row_level_ttl.md, line 160 at r1 (raw file):
Previously, ajstorm (Adam Storm) wrote…
Does this turn off TTL? If so, perhaps we can use the
RESET
syntax that PG has for storage parameters instead.
Done.
docs/RFCS/20220120_row_level_ttl.md, line 163 at r1 (raw file):
Previously, knz (kena) wrote…
what is "the job" in this sentence? You've not defined it yet.
Done.
docs/RFCS/20220120_row_level_ttl.md, line 171 at r1 (raw file):
Previously, ajstorm (Adam Storm) wrote…
Nit: RESET?
if we do what we have, DROP is more standard.
docs/RFCS/20220120_row_level_ttl.md, line 174 at r1 (raw file):
Previously, ajstorm (Adam Storm) wrote…
I thought that we reversed course on this, and now want to drop the column since leaving it around doesn't provide much value.
Done.
docs/RFCS/20220120_row_level_ttl.md, line 177 at r1 (raw file):
Previously, ajstorm (Adam Storm) wrote…
Was this last sentence intentional? Is it the error message/hint? Should it be referring to the column instead of the job?
Done.
docs/RFCS/20220120_row_level_ttl.md, line 185 at r1 (raw file):
Previously, knz (kena) wrote…
Would it make sense to register another special user for background privilege tasks, member of the admin group?
We'll want to distinguish these activities in telemetry somehow. A different user would do it, perhaps also other attributes.
that's a good idea, but i've added it as a future improvement.
docs/RFCS/20220120_row_level_ttl.md, line 191 at r1 (raw file):
Previously, knz (kena) wrote…
Would it make sense to tweak the behavior of the job, if there's an index on the TTL column(s)? In that case, there's no need to scan all rows.
yes, i'll leave this as a TODO optimisation.
docs/RFCS/20220120_row_level_ttl.md, line 200 at r1 (raw file):
Previously, knz (kena) wrote…
Throughout the RFC: I see 3 stray spaces on an empty line sometimes.
Done.
docs/RFCS/20220120_row_level_ttl.md, line 232 at r1 (raw file):
Previously, ajstorm (Adam Storm) wrote…
If we go with the PG
WITH
syntax, all of these should be prefaced byttl_
Done.
docs/RFCS/20220120_row_level_ttl.md, line 252 at r1 (raw file):
Previously, ajstorm (Adam Storm) wrote…
Do we have issues open for all of the pieces we're not planning on implementing in the MVP? If so, might be nice to reference them here.
I can open them afterwards.
docs/RFCS/20220120_row_level_ttl.md, line 261 at r1 (raw file):
Previously, ajstorm (Adam Storm) wrote…
Was this statement written before the admission control work? I was under the impression that we're no longer expecting users to have to tinker with these jobs.
foreground traffic can still be affected, that's why we add monitoring.
modifying "will" to "may".
docs/RFCS/20220120_row_level_ttl.md, line 271 at r1 (raw file):
Previously, ajstorm (Adam Storm) wrote…
Any thought as to how these will be exposed? Something like
SHOW TTL STATISTICS FOR TABLE foo
?
my thoughts were DB console, added.
docs/RFCS/20220120_row_level_ttl.md, line 314 at r1 (raw file):
Previously, ajstorm (Adam Storm) wrote…
Can you say more about this last part? I'm of the opinion that we should draft off of PG unless the syntax is really unusable. I'm not seeing that here, but it's possible I'm missing something.
I'm of the opinion it goes beyond what the storage_parameter
pg syntax was intended for, and the schema changes / behaviour applied warrants it.
I guess I'll let @vy-ton dictate the final decision (not hard to replace all the usages)
docs/RFCS/20220120_row_level_ttl.md, line 324 at r1 (raw file):
Previously, ajstorm (Adam Storm) wrote…
This seems to be outdated.
Done.
c8a66c7
to
de599bf
Compare
guess we're going with the new syntax, hehe |
5976e8e
to
d61a163
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks about good to me, with some suggestion to consider using the DEFAULT keyword where appropriate.
(We've done so in the past in other places.)
Reviewed 1 of 1 files at r3, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @otan, @rafiss, and @vy-ton)
docs/RFCS/20220120_row_level_ttl.md, line 23 at r1 (raw file):
Previously, otan (Oliver Tan) wrote…
I'm not a fan of the storage parameter syntax, see below.
the ALTER is interesting; i guess, how does this affect not allowing this syntax in CREATE? ORMs can't exactly use SHOW CREATE TABLE, so it just seems like additional syntactical sugar for me to introduce the TTL.
Yes my point is more that for a large majority of our users, they cannot control the syntax used for CREATE TABLE as much as they can control ALTER (or issue loose ALTER statements through their framework's eec function). Frameworks tend to "own" table creation.
docs/RFCS/20220120_row_level_ttl.md, line 171 at r1 (raw file):
Previously, otan (Oliver Tan) wrote…
if we do what we have, DROP is more standard.
RESET is objectively worse IMHO.
Here's the choice in terms of idiomatic SQL:
- clause-based things
ALTER ... ADD X ...
vsALTER DROP X
- attribute-based things:
WITH x = y
vsWITH x = DEFAULT
Judicious use of the DEFAULT keyword also allows multiple assignments inside one statement, some to set custom values and others to reset to defaults.
docs/RFCS/20220120_row_level_ttl.md, line 17 at r3 (raw file):
rows will be deleted after a certain period of time. As a further extension in a later release, rows rows will be automatically hidden after they've expired their TTL and before they've been physically deleted
nit: missing period.
docs/RFCS/20220120_row_level_ttl.md, line 166 at r3 (raw file):
The deletes are a SQL-based delete, meaning they will show up on changefeeds as a regular delete by the root user (we may choose to create a new user for this in future).
nit: the future
docs/RFCS/20220120_row_level_ttl.md, line 336 at r3 (raw file):
) TTL (expire_after = '5 mins', delete_batch_size = 1, job_cron = '* * 1 * *'); -- adding a new TTL ALTER TABLE ttl SET TTL;
Can you add a comment that explains what default gets set here?
docs/RFCS/20220120_row_level_ttl.md, line 171 at r1 (raw file): Previously, knz (kena) wrote…
Reset is the PG syntax for this so I'm hesitant to change it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @knz, @otan, @rafiss, @RichardJCai, and @vy-ton)
docs/RFCS/20220120_row_level_ttl.md, line 23 at r1 (raw file):
Previously, knz (kena) wrote…
Yes my point is more that for a large majority of our users, they cannot control the syntax used for CREATE TABLE as much as they can control ALTER (or issue loose ALTER statements through their framework's eec function). Frameworks tend to "own" table creation.
Done.
docs/RFCS/20220120_row_level_ttl.md, line 233 at r3 (raw file):
Previously, RichardJCai (Richard Cai) wrote…
Do we have an idea of how to implement a mechanism to hide expired rows? I'm assuming it'll be implemented in the optimizer?
Yeah,
docs/RFCS/20220120_row_level_ttl.md, line 237 at r3 (raw file):
Previously, RichardJCai (Richard Cai) wrote…
The subheading level here is kind of strange, made me think that this was a subsection of
Filtering out Expired Rows
ah yeah i moved this one section lower than it should've been.
docs/RFCS/20220120_row_level_ttl.md, line 260 at r3 (raw file):
Previously, RichardJCai (Richard Cai) wrote…
Are there databases out there that support TTL and foreign keys together?
not really; relational databases offering TTL also basically doesn't exist
docs/RFCS/20220120_row_level_ttl.md, line 336 at r3 (raw file):
Previously, knz (kena) wrote…
Can you add a comment that explains what default gets set here?
Done.
61e729c
to
d157a1c
Compare
e81a080
to
535fbc9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @knz, @nvanbenschoten, @otan, @rafiss, @RichardJCai, and @vy-ton)
docs/RFCS/20220120_row_level_ttl.md, line 159 at r8 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Will this pagination use a fixed query plan, or will it pass through the optimizer? Passing through the optimizer has the advantage that it would automatically benefit from a user-specified index (could be hash-sharded) on
crdb_internal_expiration
if a user wanted to add one.
we re-issue the query with a cursor, if that's what you mean.
docs/RFCS/20220120_row_level_ttl.md, line 166 at r8 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
How does this avoid racing with concurrent UPDATEs to rows that were at one point expired? Is this DELETE conditional on the
crdb_internal_expiration
staying the same?
we add a AND crdb_internal_expiration < now()
clause during the delete. added commentary
docs/RFCS/20220120_row_level_ttl.md, line 218 at r8 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Did we explore a dynamic batch size at any point? Related to my previous comment, you could imagine a scheme where the batch size of low-priotiy DELETEs grows dynamically until it begins hitting contention and retrying, at which point it drops down to a smaller batch size.
nope, i can add that as a future improvement.
docs/RFCS/20220120_row_level_ttl.md, line 265 at r8 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I understand why foreign keys from TTL tables are complex, but what makes foreign keys to TTL tables complex? Do we just not think that case is useful without the other?
stuff like REFERENCES ttl_table ON DELETE CASCADE, where for strict TTL we ideally need to apply these delete cascades when the row is expired.
3344bb8
to
d815af6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm cool with merging this RFC. I have some outstanding desires to decouple the column from the feature a tad and to reduce some of the available flexibility on the automatically created column. My desires are laid out in detail in comments. Otherwise,
I suspect think there's some minor details in the SET
/RESET
semantics when they're in the same statement to be worked out, but that's not for this document.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @knz, @nvanbenschoten, @otan, @rafiss, @RichardJCai, and @vy-ton)
docs/RFCS/20220120_row_level_ttl.md, line 212 at r4 (raw file):
Err:
> ALTER TABLE tbl RESET (ttl_automatic_column);
NOTICE: ttl_expire_after has been removed and the crdb_internal_expiration column has been dropped
ALTER TABLE
> SHOW CREATE TABLE tbl;
CREATE TABLE tbl (
i INT PRIMARY KEY,
should_delete BOOL,
) WITH (
ttl_enabled = 'true',
ttl_expiration_expression = 'expiration'
)
docs/RFCS/20220120_row_level_ttl.md, line 147 at r9 (raw file):
### Limitations on the `crdb_internal_expiration` column * The column can be used in indexes, constraints and primary keys. * The ON UPDATE and ON DELETE may be overriden, but is rewritten if
I don't like this. I'd much prefer that this column can't be modified. The ttl_expiration_expression
gives you all the flexibility you need.
Code quote:
* The ON UPDATE and ON DELETE may be overriden, but is rewritten if
`ttl_expire_after` is set.
docs/RFCS/20220120_row_level_ttl.md, line 239 at r9 (raw file):
planned to be implemented for the first release of TTL. ## Overriding the default value for the `crdb_internal_expiration` column
I find this framing a bit reductive. ttl_expiration_expression
is more general than this framing makes it seem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @knz, @nvanbenschoten, @otan, @rafiss, and @RichardJCai)
docs/RFCS/20220120_row_level_ttl.md, line 212 at r4 (raw file):
Some thoughts:
- For
ttl_enabled
andttl_expire_after
no concerns - It's not clear to me why we need
ttl_automatic_column
? Doesnt make sense to me that this is another way to remove TTL.
ALTER TABLE tbl RESET (ttl_automatic_column);
NOTICE: ttl_expire_after has been removed and the crdb_internal_expiration column has been dropped
- Is this also the error if I change the TTL from 30d to 60d with
ALTER TABLE tbl SET (ttl_expire_after = '30d')
> ALTER TABLE tbl RESET (ttl_expire_after)
ERROR: cannot remove ttl_expire_after without removing the ttl_automatic_column `
docs/RFCS/20220120_row_level_ttl.md, line 245 at r7 (raw file):
Previously, otan (Oliver Tan) wrote…
@vy-ton me and @ajwerner discussed and came up with this as a proposal for the "override" column, which is really now an "override expression". i think the idea gives us a bunch of flexibility & easy enough for v1, wdyt?
I think this makes sense. Its an expression that has to output a timestamp or NULL?
docs/RFCS/20220120_row_level_ttl.md, line 146 at r9 (raw file):
### Limitations on the `crdb_internal_expiration` column * The column can be used in indexes, constraints and primary keys.
Should we allow this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @knz, @nvanbenschoten, @otan, @rafiss, @RichardJCai, and @vy-ton)
docs/RFCS/20220120_row_level_ttl.md, line 212 at r4 (raw file):
It's not clear to me why we need ttl_automatic_column? Doesn't make sense to me that this is another way to remove TTL.
The idea is to seamlessly support transitioning from usage in example 1 (automatic) to example 2 (user controlled).
Is this also the error if I change the TTL from 30d to 60d with ALTER TABLE tbl SET (ttl_expire_after = '30d')
No, not at all, that's legit. It's saying update the definition of the automatic column to have an ON UPDATE
and DEFAULT
expression with 30d
instead of 60d
or whatever.
The principle of what I'm pitching here is that the automatic column is one facet of a larger subsystem we're providing here. We should be able to have the column or not have the column, and everything else should still work. The ttl_expire_after
is only relevant in the context of that automatic column. You can update ttl_expire_after
and it will update the definition of the expressions in the column. You can drop the column by disabling it.
In the ALTER TABLE tbl RESET (ttl_automatic_column);
case, if you've got a ttl_expiration_expression
that does not refer to crdb_internal_expiration
then it's legit to drop that column and key everything else working as it should. Now, you might think, why don't we just drop the column when the expression stops referencing it? Well, to me, that's too magical. It wouldn't be obvious to me that it's a destructive operation.
I also think it's weird to make the command for dropping the whole subsystem ALTER TABLE tbl RESET (ttl_expire_after);
because ttl_expire_after
is just one parameter that has to do with the automatic column. Hence my pitch for exactly one way to remove all the ttl
things: ALTER TABLE tbl RESET (ttl_enabled)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner, @knz, @nvanbenschoten, @otan, @rafiss, @RichardJCai, and @vy-ton)
docs/RFCS/20220120_row_level_ttl.md, line 245 at r7 (raw file):
Previously, vy-ton (Vy Ton) wrote…
I think this makes sense. Its an expression that has to output a timestamp or NULL?
yep
docs/RFCS/20220120_row_level_ttl.md, line 146 at r9 (raw file):
Previously, vy-ton (Vy Ton) wrote…
Should we allow this?
Yep! But now that I think about it, maybe it doesn't make sense to drop the column if TTL is dropped if it is used in this way.
docs/RFCS/20220120_row_level_ttl.md, line 147 at r9 (raw file):
Previously, ajwerner wrote…
I don't like this. I'd much prefer that this column can't be modified. The
ttl_expiration_expression
gives you all the flexibility you need.
i can buy that, will leave it to the below discussion regarding syntax to finalise this
docs/RFCS/20220120_row_level_ttl.md, line 239 at r9 (raw file):
Previously, ajwerner wrote…
I find this framing a bit reductive.
ttl_expiration_expression
is more general than this framing makes it seem.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - that said I'll be curious to see how the discussion around the decoupling mentioned by Andrew resolves.
Reviewed 1 of 1 files at r10, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner, @nvanbenschoten, @otan, @rafiss, @RichardJCai, and @vy-ton)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner, @nvanbenschoten, @otan, @rafiss, @RichardJCai, and @vy-ton)
docs/RFCS/20220120_row_level_ttl.md, line 212 at r4 (raw file):
Previously, ajwerner wrote…
It's not clear to me why we need ttl_automatic_column? Doesn't make sense to me that this is another way to remove TTL.
The idea is to seamlessly support transitioning from usage in example 1 (automatic) to example 2 (user controlled).
Is this also the error if I change the TTL from 30d to 60d with ALTER TABLE tbl SET (ttl_expire_after = '30d')
No, not at all, that's legit. It's saying update the definition of the automatic column to have an
ON UPDATE
andDEFAULT
expression with30d
instead of60d
or whatever.
The principle of what I'm pitching here is that the automatic column is one facet of a larger subsystem we're providing here. We should be able to have the column or not have the column, and everything else should still work. The
ttl_expire_after
is only relevant in the context of that automatic column. You can updatettl_expire_after
and it will update the definition of the expressions in the column. You can drop the column by disabling it.In the
ALTER TABLE tbl RESET (ttl_automatic_column);
case, if you've got attl_expiration_expression
that does not refer tocrdb_internal_expiration
then it's legit to drop that column and key everything else working as it should. Now, you might think, why don't we just drop the column when the expression stops referencing it? Well, to me, that's too magical. It wouldn't be obvious to me that it's a destructive operation.I also think it's weird to make the command for dropping the whole subsystem
ALTER TABLE tbl RESET (ttl_expire_after);
becausettl_expire_after
is just one parameter that has to do with the automatic column. Hence my pitch for exactly one way to remove all thettl
things:ALTER TABLE tbl RESET (ttl_enabled)
.
I may be alone here, but I have a mild concern that RESET ttl_enabled
will be confused with ttl_pause
for users who don't read the docs closely. In that case, they may try to temporarily disable TTL using RESET (ttl_enabled)
and only later realize the impact of their decision (that the column was dropped and all historical TTL information lost). What about something like ttl_initialize(d)
or ttl_create(d)
instead of ttl_enabled
? The opposite of initialize/create seems more dramatic than the opposite of enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajstorm, @ajwerner, @nvanbenschoten, @otan, @rafiss, @RichardJCai, and @vy-ton)
docs/RFCS/20220120_row_level_ttl.md, line 212 at r4 (raw file):
Previously, ajstorm (Adam Storm) wrote…
I may be alone here, but I have a mild concern that
RESET ttl_enabled
will be confused withttl_pause
for users who don't read the docs closely. In that case, they may try to temporarily disable TTL usingRESET (ttl_enabled)
and only later realize the impact of their decision (that the column was dropped and all historical TTL information lost). What about something likettl_initialize(d)
orttl_create(d)
instead ofttl_enabled
? The opposite of initialize/create seems more dramatic than the opposite of enabled.
I certainly buy that. I'm far from wedded to that name. How about ttl_in_use
? Or even just ttl
as the variable and the value as something that a user is unlikely to type like: subsystem initialized
. In all of my examples above that variable is implied by the others in the SET
clause. This way the removal of all ttl
features would be: ALTER TABLE tbl RESET (ttl)
. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner, @nvanbenschoten, @otan, @rafiss, @RichardJCai, and @vy-ton)
docs/RFCS/20220120_row_level_ttl.md, line 212 at r4 (raw file):
Previously, ajwerner wrote…
I certainly buy that. I'm far from wedded to that name. How about
ttl_in_use
? Or even justttl
as the variable and the value as something that a user is unlikely to type like:subsystem initialized
. In all of my examples above that variable is implied by the others in theSET
clause. This way the removal of allttl
features would be:ALTER TABLE tbl RESET (ttl)
. WDYT?
I do like ttl
for how it looks with RESET
. Explicitly enabling it is a bit strange, but I can live with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajstorm, @ajwerner, @nvanbenschoten, @otan, @rafiss, and @RichardJCai)
docs/RFCS/20220120_row_level_ttl.md, line 212 at r4 (raw file):
Hence my pitch for exactly one way to remove all the ttl things: ALTER TABLE tbl RESET (ttl_enabled)
That makes sense, I'm on board
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajstorm, @ajwerner, @otan, @rafiss, and @RichardJCai)
docs/RFCS/20220120_row_level_ttl.md, line 159 at r8 (raw file):
Previously, otan (Oliver Tan) wrote…
we re-issue the query with a cursor, if that's what you mean.
I'm just wondering what level this query is running at. Is it a SQL statement passing through an internal executor, in which case it will benefit from the SQL optimizer and take advantage of secondary indexes on crdb_internal_expiration
, or is it something lower level that hardcodes the query plan?
docs/RFCS/20220120_row_level_ttl.md, line 166 at r8 (raw file):
Previously, otan (Oliver Tan) wrote…
we add a
AND crdb_internal_expiration < now()
clause during the delete. added commentary
Have you been testing with this so far, or was your testing with blind DELETEs? How does this impact the query plan of DELETE transaction? Is it now a multi-stage operation (scan batch, filter, delete) instead of a batch of blind puts? I wouldn't be surprised if this both 1) reduced throughput and 2) forced us to use a smaller default batch size in order to limit contention.
Given this comment and the previous one, I feel like it would be helpful to add some SQL pseudo-code in this RFC somewhere.
docs/RFCS/20220120_row_level_ttl.md, line 265 at r8 (raw file):
Previously, otan (Oliver Tan) wrote…
stuff like REFERENCES ttl_table ON DELETE CASCADE, where for strict TTL we ideally need to apply these delete cascades when the row is expired.
I think I had the language wrong, the FK to a TTL table is complex, but what about one from a TTL table? In other words, what if only the child in an FK relationship has a TTL? Does that introduce any complexity?
5eebfc2
to
b1f1802
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajstorm, @ajwerner, @knz, @nvanbenschoten, @otan, @rafiss, @RichardJCai, and @vy-ton)
docs/RFCS/20220120_row_level_ttl.md, line 212 at r4 (raw file):
Previously, vy-ton (Vy Ton) wrote…
Hence my pitch for exactly one way to remove all the ttl things: ALTER TABLE tbl RESET (ttl_enabled)
That makes sense, I'm on board
changes made!
docs/RFCS/20220120_row_level_ttl.md, line 159 at r8 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I'm just wondering what level this query is running at. Is it a SQL statement passing through an internal executor, in which case it will benefit from the SQL optimizer and take advantage of secondary indexes on
crdb_internal_expiration
, or is it something lower level that hardcodes the query plan?
gotcha! yes it is using the InternalExecutor so it should go through to the optimizer
docs/RFCS/20220120_row_level_ttl.md, line 166 at r8 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Have you been testing with this so far, or was your testing with blind DELETEs? How does this impact the query plan of DELETE transaction? Is it now a multi-stage operation (scan batch, filter, delete) instead of a batch of blind puts? I wouldn't be surprised if this both 1) reduced throughput and 2) forced us to use a smaller default batch size in order to limit contention.
Given this comment and the previous one, I feel like it would be helpful to add some SQL pseudo-code in this RFC somewhere.
ah my testing has avoided it ;~; i see how it can affect results now (somehow I assumed we had a Get-And-Delete primitive)
yeah batch sizes may have to be smaller.
docs/RFCS/20220120_row_level_ttl.md, line 265 at r8 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I think I had the language wrong, the FK to a TTL table is complex, but what about one from a TTL table? In other words, what if only the child in an FK relationship has a TTL? Does that introduce any complexity?
I don't think so; I think we can live with that. Updated accordingly.
docs/RFCS/20220120_row_level_ttl.md, line 166 at r8 (raw file): Previously, otan (Oliver Tan) wrote…
(i'll experiment after this is all merged in) |
Release note: None
Build failed (retrying...): |
Build succeeded: |
Release note: None