VTShovel - VReplication support for external databases #5289
Signed-off-by: Rafael Chacon <[email protected]>
* Adds support for VStream to start from filename:pos and not GTID sets.
* Adds support for statement-based replication streams (this should only be used in the context of the MySQL streamer; it is not safe for tablet vreplication).
* Adds support to run vstream from MySQL directly.
Signed-off-by: Rafael Chacon <[email protected]>
* Adds binary to run vtshovel.
* At the moment it only works in ephemeral mode (i.e. no data is persisted back to vrsettings).
* vtshovel only works for statement-based replication right now. This is due to not having a good way to have a schema loader. We will iterate on this.
Signed-off-by: Rafael Chacon <[email protected]>
* This will be removed in a future PR. Adding while in POC.
Signed-off-by: Rafael Chacon <[email protected]>
Approach very nice overall. A few minor nits.
// NewMySQLVStreamerClient is a vstream client that allows you to stream directly from MySQL.
// In order to achieve this, the following creates a vstreamer Engine with a dummy in memorytopo.
func NewMySQLVStreamerClient(sourceConnParams *mysql.ConnParams) *MySQLVStreamerClient {
I'm thinking this function can pull the dbconfigs based on the external repl user name. Then you don't have to pass it through to vreplication.Engine.
My thinking here is that the end goal is to be able to point to any external DB; having this parameter here will make it more flexible.
I think we shouldn't rely heavily on the repl username, as we would like to refactor that soon.
What do you think?
* At the moment we only support the erepl user. Passing source conn params around was adding unnecessary complexity.
* This cleans that up and makes it more explicit that only the erepl user is supported. In the future we will add more flexibility in terms of what kind of users can be configured for external vreplication streams.
Signed-off-by: Rafael Chacon <[email protected]>
* Fix typo in some comments.
* Make VReplicator private again. This change is no longer needed. Originally we wanted "vtshovel" to be an external process. Given that this now hooks into the existing engine, there is no need to make this public.
Signed-off-by: Rafael Chacon <[email protected]>
* StripChecksum was changing the type of the event. This was a bug.
* Adds test to vstreamer to reflect new support for statement-based replication.
Signed-off-by: Rafael Chacon <[email protected]>
Looking good. Couple of comments.
proto "github.com/golang/protobuf/proto"
math "math"
This is a conflict between grpc code gen and goimports. If you re-run goimports, all these files will revert to unchanged. Or, you can just manually revert these yourself.
* Compute canAcceptStmtEvents when creating vplayer.
Signed-off-by: Rafael Chacon <[email protected]>
Hi @jawabuu, the way this code ended up landing, it is not a separate binary. It is intended to be run as part of vttablet. It makes it possible to have VReplication streams where the source is external.
There is some discussion about how this is used in our Slack community: https://vitess.slack.com/archives/C0PQY0PTK/p1604989571062700?thread_ts=1579649445.062400&cid=C0PQY0PTK
Description
Have you ever wanted to leverage the powers of vreplication outside the environment of Vitess? Do you dream about copying bytes? The following PR has a solution for you.
Introducing: vtshovel. A flexible tool that allows you to create vreplication streams directly from MySQL instances outside of the Vitess ecosystem.
To give a bit of context about the motivation for this tool: we (Slack) are in the process of migrating entire databases from our legacy MySQL clusters to Vitess. We plan to leverage this tool to help us sync MySQL instances from our legacy clusters to their Vitess counterparts.
We think other folks might find it useful to have a tool like this when doing migrations.
Core Design
TabletVStreamerClient and a NewMySQLVStreamerClient.
erepl to dbconfigs.
Additional changes
This PR also adds support for statement-based replication. Binlogplayer can understand both statement-based and row-based replication. Certain types of filters are not supported in statement-based mode, and the stream will fail in such cases. At the moment, only match-all rules are supported for statement-based replication streams.
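To illustrate the match-all restriction, here is a minimal sketch of how a player could decide whether it can accept statement events. The `Rule` type, the `"/.*"` match-all convention, the error value, and `applyStmt` are assumptions made for this example and are not taken from the PR; only the name `canAcceptStmtEvents` appears in the commit list above, and its real implementation may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// Rule is a hypothetical, simplified filter rule; it is not the Vitess type.
type Rule struct {
	Match  string // table name, or a regular expression when it starts with "/"
	Filter string // optional query restricting or transforming matched rows
}

// errStmtNotSupported is returned when a statement event cannot be applied safely.
var errStmtNotSupported = errors.New("filter rules do not support statement based replication")

// canAcceptStmtEvents reports whether every rule matches all tables without
// filtering. Assumption: "/.*" with an empty Filter denotes a match-all rule.
func canAcceptStmtEvents(rules []Rule) bool {
	if len(rules) == 0 {
		return false
	}
	for _, r := range rules {
		if r.Match != "/.*" || r.Filter != "" {
			return false
		}
	}
	return true
}

// applyStmt fails the stream instead of applying a statement that the
// filter rules cannot be evaluated against.
func applyStmt(rules []Rule, sql string) error {
	if !canAcceptStmtEvents(rules) {
		return errStmtNotSupported
	}
	fmt.Println("applying:", sql)
	return nil
}

func main() {
	matchAll := []Rule{{Match: "/.*"}}
	filtered := []Rule{{Match: "t1", Filter: "select id from t1 where id > 10"}}
	fmt.Println(applyStmt(matchAll, "insert into t1 values (1)"))
	fmt.Println(applyStmt(filtered, "insert into t1 values (1)"))
}
```

The point of the sketch is that a raw SQL statement carries no row images, so a per-table or per-row filter cannot be applied to it; rejecting the event is safer than replaying it unfiltered.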
It also adds support for streaming without using GTIDs. This was done by cutting some corners, but it will be cleaned up soon. @sougou and I are working on that.
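As a rough illustration of a non-GTID start position, the sketch below parses a `filename:pos` string into its two parts. The `FilePos` type, the `parseFilePos` helper, and the sample binlog file name are hypothetical and only show the shape of the input; the PR's actual position handling is not shown in this conversation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// FilePos is a hypothetical binlog coordinate: a binlog file name and a byte offset.
type FilePos struct {
	File string
	Pos  uint64
}

// parseFilePos splits "filename:pos" on the last colon and parses the offset.
func parseFilePos(s string) (FilePos, error) {
	i := strings.LastIndex(s, ":")
	if i <= 0 || i == len(s)-1 {
		return FilePos{}, fmt.Errorf("invalid position %q: want filename:pos", s)
	}
	pos, err := strconv.ParseUint(s[i+1:], 10, 64)
	if err != nil {
		return FilePos{}, fmt.Errorf("invalid offset in %q: %v", s, err)
	}
	return FilePos{File: s[:i], Pos: pos}, nil
}

func main() {
	fp, err := parseFilePos("mysql-bin.000003:154")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(fp.File, fp.Pos)
}
```

Unlike a GTID set, a file and offset pair is only meaningful on the server that wrote that binlog, which is one reason to treat this mode as specific to streaming from a single external MySQL instance.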