-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
VReplication: Undo vschema changes on LookupVindex workflow creation fail #16810
Conversation
Review ChecklistHello reviewers! 👋 Please follow this checklist when reviewing this Pull Request. General
Tests
Documentation
New flags
If a workflow is added or modified:
Backward compatibility
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #16810 +/- ##
==========================================
- Coverage 69.51% 69.46% -0.06%
==========================================
Files 1569 1571 +2
Lines 202517 203042 +525
==========================================
+ Hits 140780 141033 +253
- Misses 61737 62009 +272 ☔ View full report in Codecov by Sentry. |
Signed-off-by: Matt Lord <[email protected]>
Signed-off-by: Matt Lord <[email protected]>
8381a76
to
53b4a77
Compare
Signed-off-by: Matt Lord <[email protected]>
0842795
to
16bb3a6
Compare
Signed-off-by: Matt Lord <[email protected]>
16bb3a6
to
c99eb2c
Compare
Signed-off-by: Matt Lord <[email protected]>
Signed-off-by: Matt Lord <[email protected]>
Signed-off-by: Matt Lord <[email protected]>
d53652b
to
2b91279
Compare
Signed-off-by: Matt Lord <[email protected]>
2b91279
to
32c29e1
Compare
Signed-off-by: Matt Lord <[email protected]>
0d5ba25
to
5bd4b7e
Compare
if err := mz.deploySchema(); err != nil { | ||
return err | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Signed-off-by: Matt Lord <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes in this file are not relevant here, but I wanted to add the same new capabilities to the test framework for future use (soon).
go/vt/vtctl/workflow/materializer.go
Outdated
// Table already exists. Let's be sure that it doesn't already have data. | ||
// We exclude multi-tenant migrations from this check as the target tables | ||
// are expected to frequently have data from previously migrated tenants. | ||
if !mz.IsMultiTenantMigration() && td.RowCount > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RowCount
is from information_schema.tables
which may not always be updated. We have anecdotally found those stats not getting updated for significant periods of time: showing 0 even after several copy phases have run. On this branch I started a second workflow sharing tables with the first one and it passed this validation, failing with aDuplicate Key
error.
Which was why we were going for the approach in this PR: https://github.com/vitessio/vitess/pull/16826/files#diff-d3a320c7b03791f5d24189e3ae6d7fcac814f4fa1b3d7c02d496b3d8f0adf588R1040-R1043.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the row count is an estimate. It can remain 0 for some time if you have not inserted/updated enough rows to go beyond the initial 16KiB page in the tablespace. I would guess that in your testing you only had a few very small rows in the table (e.g. corder
). Doing multiple copy phases would not help or matter in this case either. I don't think that this is a very likely scenario in production — since InnoDB updates the stats in information_schema when 1/16th of the table has changed, so now we're talking about 1KiB, and that normally happens pretty quickly on a brand new table that starts out as 16KiB in size — but it's certainly possible (then you'd have to wait for one of the other things that trigger an update).
I can remove the "table has existing data" code here in this PR then if you prefer the other approach. That was not the primary focus in this PR anyway. Sound good?
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My tests were for a trivial number of rows, yes. But this was also reported by others recently for larger tables where the copy phase itself took minutes. They were using Progress
and not seeing any changes there without analyze table
... Not sure why the stats update didn't happen in that case.
In any case, yes, let's handle the check for data in that other PR. Rest looks good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I did that here: 345115e
go/vt/vtctl/workflow/materializer.go
Outdated
// Table already exists. Let's be sure that it doesn't already have data. | ||
// We exclude multi-tenant migrations from this check as the target tables | ||
// are expected to frequently have data from previously migrated tenants. | ||
if !mz.IsMultiTenantMigration() && td.RowCount > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My tests were for a trivial number of rows, yes. But this was also reported by others recently for larger tables where the copy phase itself took minutes. They were using Progress
and not seeing any changes there without analyze table
... Not sure why the stats update didn't happen in that case.
In any case, yes, let's handle the check for data in that other PR. Rest looks good.
And defer that work to vitessio#16826 Signed-off-by: Matt Lord <[email protected]>
fee40da
to
345115e
Compare
Description
This restores the original vschema on the
LookupVindex create
command's--keyspace
, where the vindex is created, when the workflow creation fails.It also adds a check that when using a matching table that already exists in the
--table-keyspace
, the table does not already have any data.Related Issue(s)
LookupVindex create
does not properly cleanup / undo state changes made when it fails to create the workflow #15984Checklist