changefeedccl: Support interval expressions for cursor #82350

Closed
miretskiy opened this issue Jun 2, 2022 · 3 comments · Fixed by #88058
Labels
A-cdc Change Data Capture C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) sync-me sync-me-5 T-cdc

Comments

@miretskiy
Contributor

miretskiy commented Jun 2, 2022

The cursor option requires an hlc.Timestamp to be specified.
This is very hard to get right: it usually involves querying the jobs table to pick up the current cursor, or running a now() query and converting the result to nanoseconds, etc.
We should support interval expressions so that CREATE CHANGEFEED ... WITH cursor='now() - interval 12h' works.

Jira issue: CRDB-16425
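
To illustrate the manual conversion described above, here is a minimal client-side sketch that turns a wall-clock time into integer nanoseconds since the Unix epoch and places it in a statement. The function names, the table name, and the sink URL are hypothetical and used only for illustration; the assumption that a plain nanosecond integer is an accepted cursor value comes from the AS OF SYSTEM TIME parameter formats, not from this issue.

```python
from datetime import datetime, timedelta, timezone


def cursor_nanos(ts: datetime) -> str:
    """Return ts as integer nanoseconds since the Unix epoch, as a string."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    delta = ts - datetime(1970, 1, 1, tzinfo=timezone.utc)
    # Integer arithmetic only: a float would lose precision at this magnitude.
    seconds = delta.days * 86_400 + delta.seconds
    return str(seconds * 1_000_000_000 + delta.microseconds * 1_000)


def changefeed_stmt(table: str, sink: str, cursor: str) -> str:
    """Build the statement text (hypothetical helper, no escaping is done)."""
    return f"CREATE CHANGEFEED FOR TABLE {table} INTO '{sink}' WITH cursor = '{cursor}'"


if __name__ == "__main__":
    # Start the feed from 12 hours ago; table and sink are placeholders.
    start = datetime.now(timezone.utc) - timedelta(hours=12)
    print(changefeed_stmt("my_table", "kafka://host:9092", cursor_nanos(start)))
```

Every client has to repeat this arithmetic, which is the friction the request is about.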

@miretskiy miretskiy added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-cdc Change Data Capture T-cdc labels Jun 2, 2022
@blathers-crl

blathers-crl bot commented Jun 2, 2022

cc @cockroachdb/cdc

@amruss
Contributor

amruss commented Jun 8, 2022

Also applicable to other time-related options (end time, etc.).

biradarganesh25 added a commit that referenced this issue Sep 16, 2022
The negative timestamp value is calculated as the difference between a
predefined time (this is the time from which we expect changefeed to run)
and the current statement time. knobs is used to get the current statement
time because it will only be available once the change feed statement
starts executing.

Resolves: #82350

Release note: None
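
The calculation described in this commit message can be sketched roughly as follows. This is not the test code from the PR: the function name is hypothetical, and emitting the offset in microseconds is an assumption chosen to keep the value exact.

```python
from datetime import datetime, timedelta, timezone


def negative_cursor(start_from: datetime, statement_time: datetime) -> str:
    """Express start_from as a negative offset relative to statement_time."""
    offset = statement_time - start_from
    if offset <= timedelta(0):
        raise ValueError("start_from must be earlier than statement_time")
    # Dividing one timedelta by another with // yields an exact integer.
    micros = offset // timedelta(microseconds=1)
    return f"-{micros}us"


if __name__ == "__main__":
    stmt_time = datetime(2022, 9, 16, 12, 0, tzinfo=timezone.utc)
    wanted = datetime(2022, 9, 16, 9, 0, tzinfo=timezone.utc)
    print(negative_cursor(wanted, stmt_time))
```

The offset only makes sense relative to the statement time, which is why the commit message notes that this time is available only once the statement starts executing.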
@biradarganesh25
Contributor

The currently accepted values for cursor are all of the formats listed here: https://www.cockroachlabs.com/docs/v22.1/as-of-system-time#parameters. After discussing with @miretskiy, that should be enough, so in the PR I am just adding additional testing.
To be more specific, for intervals the user can specify something like "-3h" as the cursor value. Valid time units are "us", "ms", "s", "m", and "h".
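
As a rough illustration of the interval form, the sketch below validates a value such as "-3h" on the client side before it is sent as a cursor. It is not CockroachDB's parser: the function name is hypothetical, and accepting only a single integer followed by one of the listed units is a simplifying assumption.

```python
import re
from datetime import timedelta

# Units listed in the comment above, mapped to timedelta keyword arguments.
_UNITS = {
    "us": "microseconds",
    "ms": "milliseconds",
    "s": "seconds",
    "m": "minutes",
    "h": "hours",
}
_PATTERN = re.compile(r"^-(\d+)(us|ms|s|m|h)$")


def parse_negative_interval(text: str) -> timedelta:
    """Parse a value like '-3h' into a negative timedelta, or raise ValueError."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a negative interval with a known unit: {text!r}")
    amount, unit = match.groups()
    return -timedelta(**{_UNITS[unit]: int(amount)})


if __name__ == "__main__":
    print(parse_negative_interval("-3h"))
```

A check like this can catch a missing sign or an unsupported unit before the statement is sent.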

biradarganesh25 added a commit that referenced this issue Sep 18, 2022
The negative timestamp value is calculated as the difference between a
predefined time (this is the time from which we expect changefeed to run)
and the current statement time. knobs is used to get the current statement
time because it will only be available once the change feed statement
starts executing.

Resolves: #82350

Release note: None
craig bot pushed a commit that referenced this issue Sep 19, 2022
87265: descs: unify by-name and by-ID lookups r=postamar a=postamar

This commit unifies the descriptor lookup logic in the descs.Collection
object without changing its behavior. By-name lookups now first
retrieve a descriptor ID and then perform a by-ID lookup. The by-ID
lookup logic is uniquely defined and is responsible for all necessary
validation, hydration and filtering.

This commit also fixes a longstanding hydration bug, in which
uncommitted tables were not properly re-hydrated following a schema
change of a column type in the same transaction.

This commit also changes the return type of GetMutableSchema* methods
from a catalog.SchemaDescriptor to a *schemadesc.Mutable. Previously
these methods would return mutable descriptors on a best-effort basis:
non-physical schema descriptors like temporary or virtual schemas
would remain immutable. Now, an error is returned instead in those
cases.

Release justification: no change to functionality
Release note: None


87957: opt: normalize JSON subscripts `[...]` to fetch value operators `->` r=faizaanmadhani a=faizaanmadhani

Previously, jsonb subscripts were not normalized to fetch value operators. This didn't allow queries with filters like `json_col['a'] = '1'` to scan inverted indexes, resulting in less efficient query plans. With this rule, these types of queries can be index accelerated.

Resolves: #83441

Release note (performance improvement): The optimizer will now plan inverted index scans for queries with JSON subscripting filters, like `json_col['field'] = '"value"'`.

87987: storage: handle MVCC range keys in all `NewMVCCIterator` callers r=tbg a=erikgrinaker

**storage: handle inline values in `MVCCIsSpanEmpty`**

`MVCCIsSpanEmpty` uses an `MVCCIncrementalIterator` to handle time
bounds. However, this will error if it encounters any inline values.
This patch instead uses a regular MVCC iterator when time bounds are not
given (the typical case), which correctly handles inline values.

Release note: None

**batcheval: use `MVCCIsSpanEmpty` in `EndTxn`**

In addition to code deduplication, this also respects MVCC range
tombstones.

Release note: None
  
**kvserver: respect MVCC range keys in `optimizePuts`**

If many point writes are submitted, `optimizePuts()` looks for virgin
keyspace and switches to blind writes to amortize seek costs. This did
not check for MVCC range keys, which could lead to faulty conflict
checks and stats updates.

Release note: None
  
**storage: note functions that ignore MVCC range keys**

Release note: None

Touches #87366.

88058: cdc: test negative timestamp cdc r=biradarganesh25 a=biradarganesh25

The negative timestamp value is calculated as the difference between a
predefined time (this is the time from which we expect changefeed to run)
and the current statement time. knobs is used to get the current statement
time because it will only be available once the change feed statement
starts executing.

Resolves: #82350

Release note: None

Co-authored-by: Marius Posta <[email protected]>
Co-authored-by: Faizaan Madhani <[email protected]>
Co-authored-by: Erik Grinaker <[email protected]>
Co-authored-by: Ganeshprasad Rajashekhar Biradar <[email protected]>
@craig craig bot closed this as completed in d87876d Sep 19, 2022