Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rfc: row level TTL #75189

Merged
merged 1 commit into from
Feb 2, 2022
Merged

rfc: row level TTL #75189

merged 1 commit into from
Feb 2, 2022

Conversation

otan
Copy link
Contributor

@otan otan commented Jan 20, 2022

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@otan
Copy link
Contributor Author

otan commented Jan 20, 2022

cc @ajwerner @dt @nvanbenschoten if you wanted to take a look

@otan otan marked this pull request as ready for review January 20, 2022 02:35
@otan otan requested a review from a team as a code owner January 20, 2022 02:35
Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @otan, @rafiss, and @vy-ton)


docs/RFCS/20220120_row_level_ttl.md, line 23 at r1 (raw file):

   id INT PRIMARY KEY,
   text TEXT
) TTL '5 minutes'

Is this a standard PostgreSQL syntax?

If yes, please disregard the rest of my comment.

Otherwise, I'd like to recommend to not add syntax to CREATE, and instead focus on (and promote) the ALTER syntax.
There are a few reasons for this.

  1. the argument after TTL will be (sometimes, like in this example) an interval value. The Interval syntax uses keywords (including TO and ON) which are at risk of becoming ambiguous with other syntax clauses that PosgreSQL allows at the end of CREATE.

  2. we'll likely want to support placeholders and other types of expression for the value parts of the K/V pair syntax presented below, to feed in constants and other computations. However, placeholder / expression support in CREATE is a more difficult endeavor than other statements IIRC.

  3. we know from experience that it's hard in ORMs and other client-side SQL frameworks to add clauses to CREATE statements. However, these are OK with ALTER.

  4. I'm not sure what adding the support in CREATE brings, given that the ALTER can be run immediately after on an empty table. One syntax extension / code path is simpler to implement/maintain than two.


docs/RFCS/20220120_row_level_ttl.md, line 30 at r1 (raw file):

**Note**: this does NOT cover partition level deletes, where deletes can be
*"accelerated" by deleting entire "partitions" at a time.

nit: stray * that breaks the formatting below


docs/RFCS/20220120_row_level_ttl.md, line 86 at r1 (raw file):

   id INT PRIMARY KEY,
   text TEXT,
   crdb_internal_expiration TIMESTAMPTZ NOT NULL DEFAULT current_timestamp() + '5 minutes' ON UPDATE current_timestamp() + '5 minutes'

here and below, can you add the NOT VISIBLE clause to highlight the column is hidden


docs/RFCS/20220120_row_level_ttl.md, line 98 at r1 (raw file):

   text TEXT,
   other_ts_field TIMESTAMPTZ
) TTL (expiration_column = 'other_ts_field')

why the string literal syntax? Can't we use an identifier here?


docs/RFCS/20220120_row_level_ttl.md, line 147 at r1 (raw file):

However, due to the limitation that `ON UPDATE` cannot reference a table
expression, this cannot yet be implemented. This can change when we implemented

nit: when we implement


docs/RFCS/20220120_row_level_ttl.md, line 157 at r1 (raw file):

```sql
ALTER TABLE tbl SET TTL; -- introduce a TTL. no-op if there is already a TTL

I don't understand what the clause does if there's no specification afterwards. What does "introduce a TTL" mean if there's no expiration?


docs/RFCS/20220120_row_level_ttl.md, line 163 at r1 (raw file):

Note initially any options applied to the TTL in regards to the job will not

what is "the job" in this sentence? You've not defined it yet.


docs/RFCS/20220120_row_level_ttl.md, line 185 at r1 (raw file):

job gets created (or modified) by the syntax detailed in the previous section.
The deletes are a SQL-based delete, meaning they will show up on changefeeds
as a regular delete by the root user.

Would it make sense to register another special user for background privilege tasks, member of the admin group?

We'll want to distinguish these activities in telemetry somehow. A different user would do it, perhaps also other attributes.


docs/RFCS/20220120_row_level_ttl.md, line 191 at r1 (raw file):

On each job invocation, the deletion job performs the following:
* Establish a "start time".
* Paginate over all ranges for the table. Allocate these to a "range worker pool".

Would it make sense to tweak the behavior of the job, if there's an index on the TTL column(s)? In that case, there's no need to scan all rows.


docs/RFCS/20220120_row_level_ttl.md, line 200 at r1 (raw file):

    * Issue a SQL DELETE up to delete_batch_size entries from the table using the
      PRIMARY KEYs selected above.
  

Throughout the RFC: I see 3 stray spaces on an empty line sometimes.

@ajstorm ajstorm requested a review from knz January 20, 2022 13:54
Copy link
Collaborator

@ajstorm ajstorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to see this, and nice job putting it together! 🎉

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz, @otan, @rafiss, and @vy-ton)


docs/RFCS/20220120_row_level_ttl.md, line 12 at r1 (raw file):

Row-level "time to live" (TTL) is a mechanism in which rows from a table
automatically get deleted once the row surpasses an expiration time (the "TTL").
This has been a [feature commonly asked for](20239).

Nit: this link is broken.


docs/RFCS/20220120_row_level_ttl.md, line 16 at r1 (raw file):

This RFC proposes a CockroachDB level mechanism to support row-level TTL, where
rows will be deleted after a certain period of time. As a further extension in a
later release, rows will automatically be hidden after expiring a TTL.

Nit: for clarity here, does it make sense to change this to "...rows will be automatically hidden after they've expired their TTL and before they've been physically deleted"


docs/RFCS/20220120_row_level_ttl.md, line 23 at r1 (raw file):

Previously, knz (kena) wrote…

Is this a standard PostgreSQL syntax?

If yes, please disregard the rest of my comment.

Otherwise, I'd like to recommend to not add syntax to CREATE, and instead focus on (and promote) the ALTER syntax.
There are a few reasons for this.

  1. the argument after TTL will be (sometimes, like in this example) an interval value. The Interval syntax uses keywords (including TO and ON) which are at risk of becoming ambiguous with other syntax clauses that PosgreSQL allows at the end of CREATE.

  2. we'll likely want to support placeholders and other types of expression for the value parts of the K/V pair syntax presented below, to feed in constants and other computations. However, placeholder / expression support in CREATE is a more difficult endeavor than other statements IIRC.

  3. we know from experience that it's hard in ORMs and other client-side SQL frameworks to add clauses to CREATE statements. However, these are OK with ALTER.

  4. I'm not sure what adding the support in CREATE brings, given that the ALTER can be run immediately after on an empty table. One syntax extension / code path is simpler to implement/maintain than two.

Alternatively, we could go with the standard PG mechanism of using a storage_parameter. Following the PG syntax, this would look like:

CREATE TABLE tbl (
     id INT PRIMARY KEY,
     test TEXT
) WITH (ttl_expiry = '5 minutes')

docs/RFCS/20220120_row_level_ttl.md, line 75 at r1 (raw file):

   id INT PRIMARY KEY,
   text TEXT
) TTL (expiration_time = '5 minutes', delete_batch_size = 1, job_cron = '* * 1 * *')

As mentioned, I think we should go with WITH (ttl_foo = X, ttl_bar = Y, ...)


docs/RFCS/20220120_row_level_ttl.md, line 121 at r1 (raw file):

    // DurationExpr is the automatically assigned interval for when the TTL should apply to a row.
    optional string duration_expr = 1 [(gogoproto.nullable)=false];
    optional string deletion_cron = 2 [(gogoproto.nullable)=false];

What's DeletionCron for?


docs/RFCS/20220120_row_level_ttl.md, line 131 at r1 (raw file):

User override for the crdb_internal_expiration column

Nit: For readability, you might want to move this section lower down where we talk about extensions post-MVP.


docs/RFCS/20220120_row_level_ttl.md, line 160 at r1 (raw file):

ALTER TABLE tbl SET TTL  (expiration_column = 'other_ts_field'); 
-- introduce a TTL with the given option OR apply the given option to the TTL
ALTER TABLE tbl SET TTL (expiration_column = NULL) -- drop an option from the TTL

Does this turn off TTL? If so, perhaps we can use the RESET syntax that PG has for storage parameters instead.


docs/RFCS/20220120_row_level_ttl.md, line 171 at r1 (raw file):

```sql
ALTER TABLE tbl DROP TTL

Nit: RESET?


docs/RFCS/20220120_row_level_ttl.md, line 174 at r1 (raw file):

The DROP will not drop any TTL columns. However, recreating the TTL column will

I thought that we reversed course on this, and now want to drop the column since leaving it around doesn't provide much value.


docs/RFCS/20220120_row_level_ttl.md, line 177 at r1 (raw file):

not succeed as it may risk expiring all the rows immediately. Instead, the user
must either DROP the existing TTL columns or specify it as the `ttl_column`
option. A hint will be provided detailing this. Deletion Process When a TTL

Was this last sentence intentional? Is it the error message/hint? Should it be referring to the column instead of the job?


docs/RFCS/20220120_row_level_ttl.md, line 194 at r1 (raw file):

* Then, for each range worker performs the following:
  * Traverse the range in order of PRIMARY KEY:
    * Issue a SQL SELECT AS OF SYSTEM TIME '-30s' query on the PRIMARY KEYs up to

Nit: Are we going to limit the setting of TTL, say to a minimum of 5 minutes? A very short TTL (5 seconds) would cause problems if the job is only running every 30 seconds.


docs/RFCS/20220120_row_level_ttl.md, line 204 at r1 (raw file):

To ensure the deletion job does not affect foreground traffic, we plan on using
[admission control](admission control) on a SQL transaction at the lowest
possible value (-256).

Nit: To leave room for other jobs in the future which may be even lower priority, perhaps we want to hedge a bit, say by using -200?


docs/RFCS/20220120_row_level_ttl.md, line 232 at r1 (raw file):

Option | Description
--- | ---
select_batch_size | How many rows to fetch from the range that have expired at a given time. Defaults to 500.

If we go with the PG WITH syntax, all of these should be prefaced by ttl_


docs/RFCS/20220120_row_level_ttl.md, line 234 at r1 (raw file):

select_batch_size | How many rows to fetch from the range that have expired at a given time. Defaults to 500.
delete_batch_size | How many rows to delete at a time. Defaults to 100.
range_concurrency | How many concurrent ranges are being worked on at a time. Defaults to 8.

Nit: Should this default to some multiple of core count?


docs/RFCS/20220120_row_level_ttl.md, line 252 at r1 (raw file):

Rows that have expired their TTL can be optionally removed from all SQL query
results. However, this is not planned to be implemented for the first release of
TTL.

Do we have issues open for all of the pieces we're not planning on implementing in the MVP? If so, might be nice to reference them here.


docs/RFCS/20220120_row_level_ttl.md, line 259 at r1 (raw file):

DELETE/ON UPDATE CASCADEs before we can look at allowing this functionality.

## Observability

Might be good to get @maryliag's thoughts on this section.


docs/RFCS/20220120_row_level_ttl.md, line 261 at r1 (raw file):

## Observability
As mentioned above, we expect users will need to tinker with the deletion job to
ensure foreground traffic is not affected.

Was this statement written before the admission control work? I was under the impression that we're no longer expecting users to have to tinker with these jobs.


docs/RFCS/20220120_row_level_ttl.md, line 271 at r1 (raw file):

progress:
* selection and deletion rate
* selection and deletion latency

Any thought as to how these will be exposed? Something like SHOW TTL STATISTICS FOR TABLE foo?


docs/RFCS/20220120_row_level_ttl.md, line 314 at r1 (raw file):

the experience of this is perceived as a little clunky in terms of enabling
or disabling TTL compared to the currently proposed syntax as it is unclear
that the table columns themselves may be modified.

Can you say more about this last part? I'm of the opinion that we should draft off of PG unless the syntax is really unusable. I'm not seeing that here, but it's possible I'm missing something.


docs/RFCS/20220120_row_level_ttl.md, line 324 at r1 (raw file):

  secondary index entry before the delete on the primary index entry. This is
  predicated on filtering out expired rows working.
* Admission control  improvements: In future, we can explore applying

This seems to be outdated.

Copy link
Contributor Author

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @knz, @otan, @rafiss, and @vy-ton)


docs/RFCS/20220120_row_level_ttl.md, line 23 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

Alternatively, we could go with the standard PG mechanism of using a storage_parameter. Following the PG syntax, this would look like:

CREATE TABLE tbl (
     id INT PRIMARY KEY,
     test TEXT
) WITH (ttl_expiry = '5 minutes')

I'm not a fan of the storage parameter syntax, see below.

the ALTER is interesting; i guess, how does this affect not allowing this syntax in CREATE? ORMs can't exactly use SHOW CREATE TABLE, so it just seems like additional syntactical sugar for me to introduce the TTL.


docs/RFCS/20220120_row_level_ttl.md, line 75 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

As mentioned, I think we should go with WITH (ttl_foo = X, ttl_bar = Y, ...)

I disagree, and we should have that discussion in the relevant drawback section below.


docs/RFCS/20220120_row_level_ttl.md, line 86 at r1 (raw file):

Previously, knz (kena) wrote…

here and below, can you add the NOT VISIBLE clause to highlight the column is hidden

my bad, added!


docs/RFCS/20220120_row_level_ttl.md, line 98 at r1 (raw file):

Previously, knz (kena) wrote…

why the string literal syntax? Can't we use an identifier here?

there are many different option types, was hoping to use the postgres-esque (key=value, key=value) syntax here.


docs/RFCS/20220120_row_level_ttl.md, line 121 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

What's DeletionCron for?

Done.


docs/RFCS/20220120_row_level_ttl.md, line 157 at r1 (raw file):

Previously, knz (kena) wrote…

I don't understand what the clause does if there's no specification afterwards. What does "introduce a TTL" mean if there's no expiration?

we give it a default. appended text


docs/RFCS/20220120_row_level_ttl.md, line 160 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

Does this turn off TTL? If so, perhaps we can use the RESET syntax that PG has for storage parameters instead.

Done.


docs/RFCS/20220120_row_level_ttl.md, line 163 at r1 (raw file):

Previously, knz (kena) wrote…

what is "the job" in this sentence? You've not defined it yet.

Done.


docs/RFCS/20220120_row_level_ttl.md, line 171 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

Nit: RESET?

if we do what we have, DROP is more standard.


docs/RFCS/20220120_row_level_ttl.md, line 174 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

I thought that we reversed course on this, and now want to drop the column since leaving it around doesn't provide much value.

Done.


docs/RFCS/20220120_row_level_ttl.md, line 177 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

Was this last sentence intentional? Is it the error message/hint? Should it be referring to the column instead of the job?

Done.


docs/RFCS/20220120_row_level_ttl.md, line 185 at r1 (raw file):

Previously, knz (kena) wrote…

Would it make sense to register another special user for background privilege tasks, member of the admin group?

We'll want to distinguish these activities in telemetry somehow. A different user would do it, perhaps also other attributes.

that's a good idea, but i've added it as a future improvement.


docs/RFCS/20220120_row_level_ttl.md, line 191 at r1 (raw file):

Previously, knz (kena) wrote…

Would it make sense to tweak the behavior of the job, if there's an index on the TTL column(s)? In that case, there's no need to scan all rows.

yes, i'll leave this as a TODO optimisation.


docs/RFCS/20220120_row_level_ttl.md, line 200 at r1 (raw file):

Previously, knz (kena) wrote…

Throughout the RFC: I see 3 stray spaces on an empty line sometimes.

Done.


docs/RFCS/20220120_row_level_ttl.md, line 232 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

If we go with the PG WITH syntax, all of these should be prefaced by ttl_

Done.


docs/RFCS/20220120_row_level_ttl.md, line 252 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

Do we have issues open for all of the pieces we're not planning on implementing in the MVP? If so, might be nice to reference them here.

I can open them afterwards.


docs/RFCS/20220120_row_level_ttl.md, line 261 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

Was this statement written before the admission control work? I was under the impression that we're no longer expecting users to have to tinker with these jobs.

foreground traffic can still be affected, that's why we add monitoring.
modifying "will" to "may".


docs/RFCS/20220120_row_level_ttl.md, line 271 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

Any thought as to how these will be exposed? Something like SHOW TTL STATISTICS FOR TABLE foo?

my thoughts were DB console, added.


docs/RFCS/20220120_row_level_ttl.md, line 314 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

Can you say more about this last part? I'm of the opinion that we should draft off of PG unless the syntax is really unusable. I'm not seeing that here, but it's possible I'm missing something.

I'm of the opinion it goes beyond what the storage_parameter pg syntax was intended for, and the schema changes / behaviour applied warrants it.
I guess I'll let @vy-ton dictate the final decision (not hard to replace all the usages)


docs/RFCS/20220120_row_level_ttl.md, line 324 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

This seems to be outdated.

Done.

@otan otan force-pushed the ttl_rfc branch 2 times, most recently from c8a66c7 to de599bf Compare January 20, 2022 20:49
@otan
Copy link
Contributor Author

otan commented Jan 20, 2022

guess we're going with the new syntax, hehe

@otan otan force-pushed the ttl_rfc branch 4 times, most recently from 5976e8e to d61a163 Compare January 21, 2022 04:01
Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks about good to me, with some suggestion to consider using the DEFAULT keyword where appropriate.
(We've done so in the past in other places.)

Reviewed 1 of 1 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @otan, @rafiss, and @vy-ton)


docs/RFCS/20220120_row_level_ttl.md, line 23 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

I'm not a fan of the storage parameter syntax, see below.

the ALTER is interesting; i guess, how does this affect not allowing this syntax in CREATE? ORMs can't exactly use SHOW CREATE TABLE, so it just seems like additional syntactical sugar for me to introduce the TTL.

Yes my point is more that for a large majority of our users, they cannot control the syntax used for CREATE TABLE as much as they can control ALTER (or issue loose ALTER statements through their framework's eec function). Frameworks tend to "own" table creation.


docs/RFCS/20220120_row_level_ttl.md, line 171 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

if we do what we have, DROP is more standard.

RESET is objectively worse IMHO.

Here's the choice in terms of idiomatic SQL:

  • clause-based things ALTER ... ADD X ... vs ALTER DROP X
  • attribute-based things: WITH x = y vs WITH x = DEFAULT

Judicious use of the DEFAULT keyword also allows multiple assignments inside one statement, some to set custom values and others to reset to defaults.


docs/RFCS/20220120_row_level_ttl.md, line 17 at r3 (raw file):

rows will be deleted after a certain period of time. As a further extension in a
later release, rows rows will be automatically hidden after they've expired
their TTL and before they've been physically deleted

nit: missing period.


docs/RFCS/20220120_row_level_ttl.md, line 166 at r3 (raw file):

The deletes are a SQL-based delete, meaning they will show up on changefeeds
as a regular delete by the root user (we may choose to create a new user for this
in future).

nit: the future


docs/RFCS/20220120_row_level_ttl.md, line 336 at r3 (raw file):

) TTL (expire_after = '5 mins', delete_batch_size = 1, job_cron = '* * 1 * *');
-- adding a new TTL
ALTER TABLE ttl SET TTL;

Can you add a comment that explains what default gets set here?

docs/RFCS/20220120_row_level_ttl.md Outdated Show resolved Hide resolved
docs/RFCS/20220120_row_level_ttl.md Outdated Show resolved Hide resolved
docs/RFCS/20220120_row_level_ttl.md Show resolved Hide resolved
@otan
Copy link
Contributor Author

otan commented Jan 21, 2022


docs/RFCS/20220120_row_level_ttl.md, line 171 at r1 (raw file):

Previously, knz (kena) wrote…

RESET is objectively worse IMHO.

Here's the choice in terms of idiomatic SQL:

  • clause-based things ALTER ... ADD X ... vs ALTER DROP X
  • attribute-based things: WITH x = y vs WITH x = DEFAULT

Judicious use of the DEFAULT keyword also allows multiple assignments inside one statement, some to set custom values and others to reset to defaults.

Reset is the PG syntax for this so I'm hesitant to change it.

Copy link
Contributor Author

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @knz, @otan, @rafiss, @RichardJCai, and @vy-ton)


docs/RFCS/20220120_row_level_ttl.md, line 23 at r1 (raw file):

Previously, knz (kena) wrote…

Yes my point is more that for a large majority of our users, they cannot control the syntax used for CREATE TABLE as much as they can control ALTER (or issue loose ALTER statements through their framework's eec function). Frameworks tend to "own" table creation.

Done.


docs/RFCS/20220120_row_level_ttl.md, line 233 at r3 (raw file):

Previously, RichardJCai (Richard Cai) wrote…

Do we have an idea of how to implement a mechanism to hide expired rows? I'm assuming it'll be implemented in the optimizer?

Yeah,


docs/RFCS/20220120_row_level_ttl.md, line 237 at r3 (raw file):

Previously, RichardJCai (Richard Cai) wrote…

The subheading level here is kind of strange, made me think that this was a subsection of Filtering out Expired Rows

ah yeah i moved this one section lower than it should've been.


docs/RFCS/20220120_row_level_ttl.md, line 260 at r3 (raw file):

Previously, RichardJCai (Richard Cai) wrote…

Are there databases out there that support TTL and foreign keys together?

not really; relational databases offering TTL also basically doesn't exist


docs/RFCS/20220120_row_level_ttl.md, line 336 at r3 (raw file):

Previously, knz (kena) wrote…

Can you add a comment that explains what default gets set here?

Done.

@otan otan force-pushed the ttl_rfc branch 3 times, most recently from 61e729c to d157a1c Compare January 23, 2022 22:21
@otan otan mentioned this pull request Jan 24, 2022
30 tasks
@otan otan force-pushed the ttl_rfc branch 2 times, most recently from e81a080 to 535fbc9 Compare January 24, 2022 06:16
Copy link
Contributor Author

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @knz, @nvanbenschoten, @otan, @rafiss, @RichardJCai, and @vy-ton)


docs/RFCS/20220120_row_level_ttl.md, line 159 at r8 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Will this pagination use a fixed query plan, or will it pass through the optimizer? Passing through the optimizer has the advantage that it would automatically benefit from a user-specified index (could be hash-sharded) on crdb_internal_expiration if a user wanted to add one.

we re-issue the query with a cursor, if that's what you mean.


docs/RFCS/20220120_row_level_ttl.md, line 166 at r8 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

How does this avoid racing with concurrent UPDATEs to rows that were at one point expired? Is this DELETE conditional on the crdb_internal_expiration staying the same?

we add a AND crdb_internal_expiration < now() clause during the delete. added commentary


docs/RFCS/20220120_row_level_ttl.md, line 218 at r8 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Did we explore a dynamic batch size at any point? Related to my previous comment, you could imagine a scheme where the batch size of low-priotiy DELETEs grows dynamically until it begins hitting contention and retrying, at which point it drops down to a smaller batch size.

nope, i can add that as a future improvement.


docs/RFCS/20220120_row_level_ttl.md, line 265 at r8 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I understand why foreign keys from TTL tables are complex, but what makes foreign keys to TTL tables complex? Do we just not think that case is useful without the other?

stuff like REFERENCES ttl_table ON DELETE CASCADE, where for strict TTL we ideally need to apply these delete cascades when the row is expired.

@otan otan force-pushed the ttl_rfc branch 2 times, most recently from 3344bb8 to d815af6 Compare January 30, 2022 13:22
Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm cool with merging this RFC. I have some outstanding desires to decouple the column from the feature a tad and to reduce some of the available flexibility on the automatically created column. My desires are laid out in detail in comments. Otherwise, :lgtm:

I suspect think there's some minor details in the SET/RESET semantics when they're in the same statement to be worked out, but that's not for this document.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @knz, @nvanbenschoten, @otan, @rafiss, @RichardJCai, and @vy-ton)


docs/RFCS/20220120_row_level_ttl.md, line 212 at r4 (raw file):
Err:

> ALTER TABLE tbl RESET (ttl_automatic_column);
NOTICE: ttl_expire_after has been removed and the crdb_internal_expiration column has been dropped
ALTER TABLE
> SHOW CREATE TABLE tbl;
 CREATE TABLE tbl (
    i INT PRIMARY KEY,
    should_delete BOOL,
 ) WITH (
     ttl_enabled = 'true', 
     ttl_expiration_expression = 'expiration'
 )

docs/RFCS/20220120_row_level_ttl.md, line 147 at r9 (raw file):

### Limitations on the `crdb_internal_expiration` column
* The column can be used in indexes, constraints and primary keys.
* The ON UPDATE and ON DELETE may be overriden, but is rewritten if

I don't like this. I'd much prefer that this column can't be modified. The ttl_expiration_expression gives you all the flexibility you need.

Code quote:

* The ON UPDATE and ON DELETE may be overriden, but is rewritten if
  `ttl_expire_after` is set.

docs/RFCS/20220120_row_level_ttl.md, line 239 at r9 (raw file):

planned to be implemented for the first release of TTL.

## Overriding the default value for the `crdb_internal_expiration` column

I find this framing a bit reductive. ttl_expiration_expression is more general than this framing makes it seem.

Copy link
Contributor

@vy-ton vy-ton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @knz, @nvanbenschoten, @otan, @rafiss, and @RichardJCai)


docs/RFCS/20220120_row_level_ttl.md, line 212 at r4 (raw file):
Some thoughts:

  • For ttl_enabled and ttl_expire_after no concerns
  • It's not clear to me why we need ttl_automatic_column? Doesnt make sense to me that this is another way to remove TTL.
ALTER TABLE tbl RESET (ttl_automatic_column);
NOTICE: ttl_expire_after has been removed and the crdb_internal_expiration column has been dropped
  • Is this also the error if I change the TTL from 30d to 60d with ALTER TABLE tbl SET (ttl_expire_after = '30d')
> ALTER TABLE tbl RESET (ttl_expire_after)
ERROR: cannot remove ttl_expire_after without removing the ttl_automatic_column `

docs/RFCS/20220120_row_level_ttl.md, line 245 at r7 (raw file):

Previously, otan (Oliver Tan) wrote…

@vy-ton me and @ajwerner discussed and came up with this as a proposal for the "override" column, which is really now an "override expression". i think the idea gives us a bunch of flexibility & easy enough for v1, wdyt?

I think this makes sense. Its an expression that has to output a timestamp or NULL?


docs/RFCS/20220120_row_level_ttl.md, line 146 at r9 (raw file):

### Limitations on the `crdb_internal_expiration` column
* The column can be used in indexes, constraints and primary keys.

Should we allow this?

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @knz, @nvanbenschoten, @otan, @rafiss, @RichardJCai, and @vy-ton)


docs/RFCS/20220120_row_level_ttl.md, line 212 at r4 (raw file):

It's not clear to me why we need ttl_automatic_column? Doesn't make sense to me that this is another way to remove TTL.

The idea is to seamlessly support transitioning from usage in example 1 (automatic) to example 2 (user controlled).

Is this also the error if I change the TTL from 30d to 60d with ALTER TABLE tbl SET (ttl_expire_after = '30d')

No, not at all, that's legit. It's saying update the definition of the automatic column to have an ON UPDATE and DEFAULT expression with 30d instead of 60d or whatever.


The principle of what I'm pitching here is that the automatic column is one facet of a larger subsystem we're providing here. We should be able to have the column or not have the column, and everything else should still work. The ttl_expire_after is only relevant in the context of that automatic column. You can update ttl_expire_after and it will update the definition of the expressions in the column. You can drop the column by disabling it.

In the ALTER TABLE tbl RESET (ttl_automatic_column); case, if you've got a ttl_expiration_expression that does not refer to crdb_internal_expiration then it's legit to drop that column and key everything else working as it should. Now, you might think, why don't we just drop the column when the expression stops referencing it? Well, to me, that's too magical. It wouldn't be obvious to me that it's a destructive operation.

I also think it's weird to make the command for dropping the whole subsystem ALTER TABLE tbl RESET (ttl_expire_after); because ttl_expire_after is just one parameter that has to do with the automatic column. Hence my pitch for exactly one way to remove all the ttl things: ALTER TABLE tbl RESET (ttl_enabled).

Copy link
Contributor Author

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner, @knz, @nvanbenschoten, @otan, @rafiss, @RichardJCai, and @vy-ton)


docs/RFCS/20220120_row_level_ttl.md, line 245 at r7 (raw file):

Previously, vy-ton (Vy Ton) wrote…

I think this makes sense. Its an expression that has to output a timestamp or NULL?

yep


docs/RFCS/20220120_row_level_ttl.md, line 146 at r9 (raw file):

Previously, vy-ton (Vy Ton) wrote…

Should we allow this?

Yep! But now that I think about it, maybe it doesn't make sense to drop the column if TTL is dropped if it is used in this way.


docs/RFCS/20220120_row_level_ttl.md, line 147 at r9 (raw file):

Previously, ajwerner wrote…

I don't like this. I'd much prefer that this column can't be modified. The ttl_expiration_expression gives you all the flexibility you need.

i can buy that, will leave it to the below discussion regarding syntax to finalise this


docs/RFCS/20220120_row_level_ttl.md, line 239 at r9 (raw file):

Previously, ajwerner wrote…

I find this framing a bit reductive. ttl_expiration_expression is more general than this framing makes it seem.

Done.

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - that said I'll be curious to see how the discussion around the decoupling mentioned by Andrew resolves.

Reviewed 1 of 1 files at r10, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner, @nvanbenschoten, @otan, @rafiss, @RichardJCai, and @vy-ton)

Copy link
Collaborator

@ajstorm ajstorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner, @nvanbenschoten, @otan, @rafiss, @RichardJCai, and @vy-ton)


docs/RFCS/20220120_row_level_ttl.md, line 212 at r4 (raw file):

Previously, ajwerner wrote…

It's not clear to me why we need ttl_automatic_column? Doesn't make sense to me that this is another way to remove TTL.

The idea is to seamlessly support transitioning from usage in example 1 (automatic) to example 2 (user controlled).

Is this also the error if I change the TTL from 30d to 60d with ALTER TABLE tbl SET (ttl_expire_after = '30d')

No, not at all, that's legit. It's saying update the definition of the automatic column to have an ON UPDATE and DEFAULT expression with 30d instead of 60d or whatever.


The principle of what I'm pitching here is that the automatic column is one facet of a larger subsystem we're providing here. We should be able to have the column or not have the column, and everything else should still work. The ttl_expire_after is only relevant in the context of that automatic column. You can update ttl_expire_after and it will update the definition of the expressions in the column. You can drop the column by disabling it.

In the ALTER TABLE tbl RESET (ttl_automatic_column); case, if you've got a ttl_expiration_expression that does not refer to crdb_internal_expiration then it's legit to drop that column and key everything else working as it should. Now, you might think, why don't we just drop the column when the expression stops referencing it? Well, to me, that's too magical. It wouldn't be obvious to me that it's a destructive operation.

I also think it's weird to make the command for dropping the whole subsystem ALTER TABLE tbl RESET (ttl_expire_after); because ttl_expire_after is just one parameter that has to do with the automatic column. Hence my pitch for exactly one way to remove all the ttl things: ALTER TABLE tbl RESET (ttl_enabled).

I may be alone here, but I have a mild concern that RESET ttl_enabled will be confused with ttl_pause for users who don't read the docs closely. In that case, they may try to temporarily disable TTL using RESET (ttl_enabled) and only later realize the impact of their decision (that the column was dropped and all historical TTL information lost). What about something like ttl_initialize(d) or ttl_create(d) instead of ttl_enabled? The opposite of initialize/create seems more dramatic than the opposite of enabled.

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajstorm, @ajwerner, @nvanbenschoten, @otan, @rafiss, @RichardJCai, and @vy-ton)


docs/RFCS/20220120_row_level_ttl.md, line 212 at r4 (raw file):

Previously, ajstorm (Adam Storm) wrote…

I may be alone here, but I have a mild concern that RESET ttl_enabled will be confused with ttl_pause for users who don't read the docs closely. In that case, they may try to temporarily disable TTL using RESET (ttl_enabled) and only later realize the impact of their decision (that the column was dropped and all historical TTL information lost). What about something like ttl_initialize(d) or ttl_create(d) instead of ttl_enabled? The opposite of initialize/create seems more dramatic than the opposite of enabled.

I certainly buy that. I'm far from wedded to that name. How about ttl_in_use? Or even just ttl as the variable and the value as something that a user is unlikely to type like: subsystem initialized. In all of my examples above that variable is implied by the others in the SET clause. This way the removal of all ttl features would be: ALTER TABLE tbl RESET (ttl). WDYT?

@ajstorm ajstorm requested a review from ajwerner January 31, 2022 14:12
Copy link
Collaborator

@ajstorm ajstorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner, @nvanbenschoten, @otan, @rafiss, @RichardJCai, and @vy-ton)


docs/RFCS/20220120_row_level_ttl.md, line 212 at r4 (raw file):

Previously, ajwerner wrote…

I certainly buy that. I'm far from wedded to that name. How about ttl_in_use? Or even just ttl as the variable and the value as something that a user is unlikely to type like: subsystem initialized. In all of my examples above that variable is implied by the others in the SET clause. This way the removal of all ttl features would be: ALTER TABLE tbl RESET (ttl). WDYT?

I do like ttl for how it looks with RESET. Explicitly enabling it is a bit strange, but I can live with that.

Copy link
Contributor

@vy-ton vy-ton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajstorm, @ajwerner, @nvanbenschoten, @otan, @rafiss, and @RichardJCai)


docs/RFCS/20220120_row_level_ttl.md, line 212 at r4 (raw file):

Hence my pitch for exactly one way to remove all the ttl things: ALTER TABLE tbl RESET (ttl_enabled)
That makes sense, I'm on board

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajstorm, @ajwerner, @otan, @rafiss, and @RichardJCai)


docs/RFCS/20220120_row_level_ttl.md, line 159 at r8 (raw file):

Previously, otan (Oliver Tan) wrote…

we re-issue the query with a cursor, if that's what you mean.

I'm just wondering what level this query is running at. Is it a SQL statement passing through an internal executor, in which case it will benefit from the SQL optimizer and take advantage of secondary indexes on crdb_internal_expiration, or is it something lower level that hardcodes the query plan?


docs/RFCS/20220120_row_level_ttl.md, line 166 at r8 (raw file):

Previously, otan (Oliver Tan) wrote…

we add a AND crdb_internal_expiration < now() clause during the delete. added commentary

Have you been testing with this so far, or was your testing with blind DELETEs? How does this impact the query plan of DELETE transaction? Is it now a multi-stage operation (scan batch, filter, delete) instead of a batch of blind puts? I wouldn't be surprised if this both 1) reduced throughput and 2) forced us to use a smaller default batch size in order to limit contention.

Given this comment and the previous one, I feel like it would be helpful to add some SQL pseudo-code in this RFC somewhere.


docs/RFCS/20220120_row_level_ttl.md, line 265 at r8 (raw file):

Previously, otan (Oliver Tan) wrote…

stuff like REFERENCES ttl_table ON DELETE CASCADE, where for strict TTL we ideally need to apply these delete cascades when the row is expired.

I think I had the language wrong, the FK to a TTL table is complex, but what about one from a TTL table? In other words, what if only the child in an FK relationship has a TTL? Does that introduce any complexity?

@otan otan force-pushed the ttl_rfc branch 2 times, most recently from 5eebfc2 to b1f1802 Compare February 1, 2022 01:21
Copy link
Contributor Author

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajstorm, @ajwerner, @knz, @nvanbenschoten, @otan, @rafiss, @RichardJCai, and @vy-ton)


docs/RFCS/20220120_row_level_ttl.md, line 212 at r4 (raw file):

Previously, vy-ton (Vy Ton) wrote…

Hence my pitch for exactly one way to remove all the ttl things: ALTER TABLE tbl RESET (ttl_enabled)
That makes sense, I'm on board

changes made!


docs/RFCS/20220120_row_level_ttl.md, line 159 at r8 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I'm just wondering what level this query is running at. Is it a SQL statement passing through an internal executor, in which case it will benefit from the SQL optimizer and take advantage of secondary indexes on crdb_internal_expiration, or is it something lower level that hardcodes the query plan?

gotcha! yes it is using the InternalExecutor so it should go through to the optimizer


docs/RFCS/20220120_row_level_ttl.md, line 166 at r8 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Have you been testing with this so far, or was your testing with blind DELETEs? How does this impact the query plan of DELETE transaction? Is it now a multi-stage operation (scan batch, filter, delete) instead of a batch of blind puts? I wouldn't be surprised if this both 1) reduced throughput and 2) forced us to use a smaller default batch size in order to limit contention.

Given this comment and the previous one, I feel like it would be helpful to add some SQL pseudo-code in this RFC somewhere.

ah my testing has avoided it ;~; i see how it can affect results now (somehow I assumed we had a Get-And-Delete primitive)
yeah batch sizes may have to be smaller.


docs/RFCS/20220120_row_level_ttl.md, line 265 at r8 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I think I had the language wrong, the FK to a TTL table is complex, but what about one from a TTL table? In other words, what if only the child in an FK relationship has a TTL? Does that introduce any complexity?

I don't think so; I think we can live with that. Updated accordingly.

@otan
Copy link
Contributor Author

otan commented Feb 1, 2022


docs/RFCS/20220120_row_level_ttl.md, line 166 at r8 (raw file):

Previously, otan (Oliver Tan) wrote…

ah my testing has avoided it ;~; i see how it can affect results now (somehow I assumed we had a Get-And-Delete primitive)
yeah batch sizes may have to be smaller.

(i'll experiment after this is all merged in)

Release note: None
@otan
Copy link
Contributor Author

otan commented Feb 2, 2022

my time has come

bors r+

@craig
Copy link
Contributor

craig bot commented Feb 2, 2022

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Feb 2, 2022

Build succeeded:

@craig craig bot merged commit 7bd7ec0 into cockroachdb:master Feb 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants