-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
changefeedccl: error when a watched table backfills #29367
Conversation
@vivekmenezes can you review the second commit, please? It's got some assumptions about leasing and schema change backfills that I want to make sure I got right. Feel free to ignore the first commit, @mrtracy will review that one. |
@mrtracy @vivekmenezes thoughts on this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/ccl/changefeedccl/changefeed_processors.go, line 171 at r2 (raw file):
}(ctx) ca.tableHistUpdaterDoneCh = make(chan struct{}) go func(ctx context.Context) {
Any reason for not using poller.RunTask() ?
pkg/ccl/changefeedccl/changefeed_stmt.go, line 328 at r2 (raw file):
return errors.Errorf(`CHANGEFEEDs cannot operate on tables being backfilled`) } }
it will be cool if you can move the above logic surrounding Mutations to sit in sqlbase/structured.go . I've consciously tried to keep it there so that future refactoring work will be easier.
pkg/ccl/changefeedccl/changefeed_test.go, line 434 at r2 (raw file):
`unique: [2]->{"a": 2, "b": "2"}`, }) })
Really nice job with the testing!
d4585a0
to
7a1ddf2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/ccl/changefeedccl/changefeed_processors.go, line 171 at r2 (raw file):
Previously, vivekmenezes wrote…
Any reason for not using poller.RunTask() ?
Good idea! I hadn't thought of it. Tried it just now, but looks like it's really awkward for distsql to handle an error during Start
, so I think I'll keep this the way it is
pkg/ccl/changefeedccl/changefeed_stmt.go, line 328 at r2 (raw file):
Previously, vivekmenezes wrote…
it will be cool if you can move the above logic surrounding Mutations to sit in sqlbase/structured.go . I've consciously tried to keep it there so that future refactoring work will be easier.
Absolutely! Done
pkg/ccl/changefeedccl/changefeed_test.go, line 434 at r2 (raw file):
Previously, vivekmenezes wrote…
Really nice job with the testing!
It caught TONS of bugs while I was writing the PR. Totally worth the effort
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commit 1 is with a couple of nits, nice tests
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
pkg/ccl/changefeedccl/table_history.go, line 87 at r1 (raw file):
// WaitForTS blocks until the given timestamp is greater or equal to the // high-water or error timestamp. In the latter case, the error is returned.
I think this is the other way around; it waits until the given timestamp is less than or equal to the high-water or error timestamp.
pkg/ccl/changefeedccl/table_history.go, line 168 at r1 (raw file):
} m.mu.Lock()
I think you should move this above the validateErr check, and remove the Lock/Unlock inside of the validateErr check block. This whole area works inside of the mutex.
7a1ddf2
to
9a534ce
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mrtracy RFAL
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
pkg/ccl/changefeedccl/table_history.go, line 87 at r1 (raw file):
Previously, mrtracy (Matt Tracy) wrote…
I think this is the other way around; it waits until the given timestamp is less than or equal to the high-water or error timestamp.
Done.
pkg/ccl/changefeedccl/table_history.go, line 168 at r1 (raw file):
Previously, mrtracy (Matt Tracy) wrote…
I think you should move this above the validateErr check, and remove the Lock/Unlock inside of the validateErr check block. This whole area works inside of the mutex.
I was hoping this way would make it easier to reason about, but sounds like my intuition was wrong. I moved this whole thing into a new method, which I think is even better (one lock followed immediately by a defer at the very start of the new method)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still LGTM, thanks!
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
pkg/ccl/changefeedccl/table_history.go, line 168 at r1 (raw file):
Previously, danhhz (Daniel Harrison) wrote…
I was hoping this way would make it easier to reason about, but sounds like my intuition was wrong. I moved this whole thing into a new method, which I think is even better (one lock followed immediately by a defer at the very start of the new method)
Yes, looks nicer.
pkg/ccl/changefeedccl/table_history.go, line 47 at r3 (raw file):
// The `ValidateThroughTS` method allows a user to block until some given // timestamp is greater (or equal) to either the high-water or the error // timestamp. In the latter case, it returns the error.
Sorry, missed this. Same comment problem that you just fixed below. This one is unexported, but now its inconsistent.
tableHistory tracks that a some invariants hold over a set of tables as time advances. Internally, two timestamps are tracked. The high-water is the highest timestamp such that every version of a TableDescriptor has met a provided invariant (via `validateFn`). An error timestamp is also kept, which is the lowest timestamp where at least one table doesn't meet the invariant. The `ValidateThroughTS` method allows a user to block until some given timestamp is greater (or equal) to either the high-water or the error timestamp. In the latter case, it returns the error. Release note: None
When a table is currently being backfilled for a schema change (e.g. adding a column with a default value), it's unclear what the expectation is for any rows that are changed during the backfill. Our current invariant is that rows are emitted with an updated timestamp and a later SELECT ... AS OF SYSTEM TIME for that row would exactly match the emitted data. During the backfill, there is nothing we can emit that would definitely meet that invariant (because the backfill can be aborted and rolled back). In the meantime, this commit makes sure that we error whenever a backfill happens, even if it's fast enough that we never get it from leasing. This also paves the way for switching to RangeFeed, which doesn't have the convenient `fetchSpansForTargets` hook that the ExportRequest based poller was (ab)using. Closes cockroachdb#28643 Release note (bug fix): CHANGEFEEDs now error when a watched table backfills (instead of undefined behavior)
9a534ce
to
fd1c235
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the reviews!
bors r=mrtracy,vivekmenezes
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
pkg/ccl/changefeedccl/table_history.go, line 47 at r3 (raw file):
Previously, mrtracy (Matt Tracy) wrote…
Sorry, missed this. Same comment problem that you just fixed below. This one is unexported, but now its inconsistent.
Done.
29367: changefeedccl: error when a watched table backfills r=mrtracy,vivekmenezes a=danhhz When a table is currently being backfilled for a schema change (e.g. adding a column with a default value), it's unclear what the expectation is for any rows that are changed during the backfill. Our current invariant is that rows are emitted with an updated timestamp and a later SELECT ... AS OF SYSTEM TIME for that row would exactly match the emitted data. During the backfill, there is nothing we can emit that would definitely meet that invariant (because the backfill can be aborted and rolled back). In the meantime, this commit makes sure that we error whenever a backfill happens, even if it's fast enough that we never get it from leasing. This also paves the way for switching to RangeFeed, which doesn't have the convenient `fetchSpansForTargets` hook that the ExportRequest based poller was (ab)using. Closes #28643 Release note (bug fix): CHANGEFEEDs now error when a watched table backfills (instead of undefined behavior) 29427: docs: Fix replace and link in table_ref diagram r=jseldess a=jseldess Needed for cockroachdb/docs#3682. Release note: None 29488: roachtest: add acceptance/decommission r=benesch,tschottdorf a=petermattis Move the decommission acceptance test to a new acceptance/decommission roachtest. Fixes #29151 Release note: None 29538: stats: document stats-related commands as experimental r=RaduBerinde a=RaduBerinde Update the documentation inside `sql.y` to designate the stats-related statements as experimental. Release note: None 29546: roachtest: skip (intentionally) failing Kafka chaos test r=petermattis a=tschottdorf This test has no chance of passing until Kafka chaos is actually supported (see #28636). Touches #29196. Release note: None 29550: testcluster: make manual replication mode disable the merge queue r=petermattis a=benesch TestClusters have a manual replication mode for use in tests that need to precisely control replication on a cluster. Teach that mode to disable the merge queue in addition to the split and replicate queues. This decreases the number of tests that need to directly disable the merge queue. Release note: None 29552: ui: add attributes to login form so LastPass will autofill it r=vilterp a=vilterp LastPass wasn't confident enough to autofill and autologin without these attributes. Fixes #29529 (fixes for LastPass, but maybe not other PW managers) Release note (admin ui change): Add attributes to the login form to allow LastPass to properly recognize it. Co-authored-by: Daniel Harrison <[email protected]> Co-authored-by: Jesse Seldess <[email protected]> Co-authored-by: Peter Mattis <[email protected]> Co-authored-by: Radu Berinde <[email protected]> Co-authored-by: Tobias Schottdorf <[email protected]> Co-authored-by: Nikhil Benesch <[email protected]> Co-authored-by: Pete Vilter <[email protected]>
Build succeeded |
29367: changefeedccl: error when a watched table backfills r=mrtracy,vivekmenezes a=danhhz When a table is currently being backfilled for a schema change (e.g. adding a column with a default value), it's unclear what the expectation is for any rows that are changed during the backfill. Our current invariant is that rows are emitted with an updated timestamp and a later SELECT ... AS OF SYSTEM TIME for that row would exactly match the emitted data. During the backfill, there is nothing we can emit that would definitely meet that invariant (because the backfill can be aborted and rolled back). In the meantime, this commit makes sure that we error whenever a backfill happens, even if it's fast enough that we never get it from leasing. This also paves the way for switching to RangeFeed, which doesn't have the convenient `fetchSpansForTargets` hook that the ExportRequest based poller was (ab)using. Closes #28643 Release note (bug fix): CHANGEFEEDs now error when a watched table backfills (instead of undefined behavior) 29427: docs: Fix replace and link in table_ref diagram r=jseldess a=jseldess Needed for cockroachdb/docs#3682. Release note: None 29488: roachtest: add acceptance/decommission r=benesch,tschottdorf a=petermattis Move the decommission acceptance test to a new acceptance/decommission roachtest. Fixes #29151 Release note: None 29538: stats: document stats-related commands as experimental r=RaduBerinde a=RaduBerinde Update the documentation inside `sql.y` to designate the stats-related statements as experimental. Release note: None 29546: roachtest: skip (intentionally) failing Kafka chaos test r=petermattis a=tschottdorf This test has no chance of passing until Kafka chaos is actually supported (see #28636). Touches #29196. Release note: None 29550: testcluster: make manual replication mode disable the merge queue r=petermattis a=benesch TestClusters have a manual replication mode for use in tests that need to precisely control replication on a cluster. Teach that mode to disable the merge queue in addition to the split and replicate queues. This decreases the number of tests that need to directly disable the merge queue. Release note: None 29552: ui: add attributes to login form so LastPass will autofill it r=vilterp a=vilterp LastPass wasn't confident enough to autofill and autologin without these attributes. Fixes #29529 (fixes for LastPass, but maybe not other PW managers) Release note (admin ui change): Add attributes to the login form to allow LastPass to properly recognize it. Co-authored-by: Daniel Harrison <[email protected]> Co-authored-by: Jesse Seldess <[email protected]> Co-authored-by: Peter Mattis <[email protected]> Co-authored-by: Radu Berinde <[email protected]> Co-authored-by: Tobias Schottdorf <[email protected]> Co-authored-by: Nikhil Benesch <[email protected]> Co-authored-by: Pete Vilter <[email protected]>
When a table is currently being backfilled for a schema change (e.g.
adding a column with a default value), it's unclear what the expectation
is for any rows that are changed during the backfill. Our current
invariant is that rows are emitted with an updated timestamp and a later
SELECT ... AS OF SYSTEM TIME for that row would exactly match the
emitted data. During the backfill, there is nothing we can emit that
would definitely meet that invariant (because the backfill can be
aborted and rolled back).
In the meantime, this commit makes sure that we error whenever a
backfill happens, even if it's fast enough that we never get it from
leasing.
This also paves the way for switching to RangeFeed, which doesn't have
the convenient
fetchSpansForTargets
hook that the ExportRequest basedpoller was (ab)using.
Closes #28643
Release note (bug fix): CHANGEFEEDs now error when a watched table
backfills (instead of undefined behavior)