-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pass in cell to EnsureVSchema and GetOrCreateShard #5930
Pass in cell to EnsureVSchema and GetOrCreateShard #5930
Conversation
6109866
to
29b8e36
Compare
@@ -82,10 +83,21 @@ func (ts *Server) EnsureVSchema(ctx context.Context, keyspace string) error { | |||
} | |||
} | |||
|
|||
err = ts.RebuildSrvVSchema(ctx, []string{} /* cells */) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we change EnsureVSchema
to accept a cells list to rebuild, and we have each tablet rebuild only its own cell? Would that still have caused the problem you had?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I like that a lot more. It still does the nice thing for single cell setups, with the benefit of being easier to reason about overall. I'll switch to that approach!
That would've been fine. The problem was that the tablet in the init loop wasn't just rebuilding its own cell it was rebuilding all the other cells.
a70acbf
to
97268cb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maggie this is looking great. I added minor comments.
@@ -54,7 +54,6 @@ func RestoreTablet(t *testing.T, localCluster *cluster.LocalProcessCluster, tabl | |||
_, err := localCluster.VtctlProcess.ExecuteCommandWithOutput("CreateKeyspace", | |||
"-keyspace_type=SNAPSHOT", "-base_keyspace="+keyspaceName, | |||
"-snapshot_time", tm.Format(time.RFC3339), restoreKSName) | |||
require.Nil(t, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like it should not be removed right?
go/vt/topo/vschema.go
Outdated
if err != nil { | ||
log.Errorf("could not rebuild SrvVschema after creating keyspace: %v", err) | ||
return err | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit - remove extra new line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with one optional suggestion.
@@ -1620,7 +1620,12 @@ func commandCreateKeyspace(ctx context.Context, wr *wrangler.Wrangler, subFlags | |||
} | |||
|
|||
if !*allowEmptyVSchema { | |||
err = wr.TopoServer().EnsureVSchema(ctx, keyspace) | |||
cells, err := wr.TopoServer().GetKnownCells(ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could also just pass an empty cells list to mean "all cells", since RebuildSrvVSchema
already knows how to expand that by doing this same thing.
f927513
to
c2deff9
Compare
Signed-off-by: Maggie Zhou <[email protected]>
Signed-off-by: Maggie Zhou <[email protected]>
Signed-off-by: Maggie Zhou <[email protected]>
Remove empty line. Signed-off-by: Maggie Zhou <[email protected]>
Signed-off-by: Maggie Zhou <[email protected]>
Signed-off-by: Maggie Zhou <[email protected]>
4f5ded2
to
72e75c1
Compare
This test hasn't been flaky. Does it pass for you locally? |
c0f1f4e
to
aadd900
Compare
throw an error Signed-off-by: Maggie Zhou <[email protected]>
aadd900
to
f8edec4
Compare
replicaTabletArgs := commonTabletArg | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zmagg and I discussed this. It seems like this was a flaky test. The following change addresses that problem.
…ema_no_multicell Pass in cell to EnsureVSchema and GetOrCreateShard
Description
Modifies
EnsureVSchema
so that we always pass in a cell and it rebuilds SrvVSchema only in the provided cell. This removes its auto-rebuild of all cells automatically when you're creating a new shard or keyspace.Background
During a setup of a new region, we had a misconfigured topology and spinning up a new shard caused us to corrupt all cells in the topology, just as part of booting up a new tablet. For safety reasons, it seems better to decouple spinning up a new shard from rebuilding SrvVSchema on all cells. As talked about in person a few weeks ago--we'll keep the rebuild for the simple case where there's only one cell.
Signed-off-by: Maggie Zhou [email protected]