Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CopySchemaShard doesn't work properly with collation and charset defaults set on table #5265

Closed
sverch opened this issue Oct 3, 2019 · 7 comments

Comments

@sverch
Copy link
Contributor

sverch commented Oct 3, 2019

Overview of the Issue

SHOW CREATE TABLE can sometimes return a CREATE TABLE statement that is different from how the table was actually created. In this case, setting the default COLLATION and CHARSET options for the table can result in SHOW CREATE TABLE returning a CREATE TABLE command that explicitly sets the COLLATE option on varchar fields.

The end result is that the vitess CopySchemaShard command will create a table on the destination shard with a different CREATE TABLE statement than it was originally created with, resulting in two slightly different tables. This will cause the CopySchemaShard command to report failure.

Reproduction Steps

In MySQL, on a tablet, create a table like this:

CREATE TABLE test (
 key_myKeyStringField varchar(750) DEFAULT NULL
)ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin

Show create table will display:

key_myKeyStringField varchar(750)  COLLATE utf8mb4_bin DEFAULT NULL

If you create the table using:

CREATE TABLE test (
 key_myKeyStringField varchar(750) COLLATE utf8mb4_bin DEFAULT NULL
)

or

CREATE TABLE test (
 key_myKeyStringField varchar(750) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL
)

Show create table will display:

CREATE TABLE test (
 key_myKeyStringField varchar(750) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL
)

If you run CopySchemaShard with a source of the table created with the first CREATE TABLE statement above, it will create a table using the second CREATE TABLE statement, causing the two tables to be different and the command to fail.

It's possible that there are some cases where if the connection's charset/collation differ to the table, SHOW CREATE TABLE may report back something that is more verbose, so the results of this reproduction may vary.

@sougou
Copy link
Contributor

sougou commented Oct 5, 2019

The comment on the compare schema explicitly states that we expect that the source and targets have the same default character set settings. Trying to make it smarter is essentially an endless complication because mysql has default settings on every schema entity and they layer on top of each other.

The only reasonable fix for the problem is to add an option to CopySchemaShard to make it not fail the comparison. This means that we leave it to the user to determine that the schemas match.

We should still compare and report the differences. That will tell the users that they need to verify this mismatch.

@morgo morgo added this to the v5.0 milestone Oct 25, 2019
@morgo morgo modified the milestones: v5.0, v6.0 Feb 4, 2020
@aquarapid
Copy link
Contributor

There's a similar issue for the "CREATE DATABASE" step of CopySchemaShard, where things might differ slightly because of differences in MySQL versions. Note that we do that as a separate step, so a "force" flag would have to bypass a failure on that step too.

@aquarapid
Copy link
Contributor

I am adding a patch to add a --skip-verify option to CopySchemaShard, that will do the copy as usual, but skip the post-copy verify. Since this is an opt-in option, I'm not displaying the differences in a case like that.

aquarapid added a commit to planetscale/vitess that referenced this issue Mar 25, 2020
@sverch
Copy link
Contributor Author

sverch commented Mar 26, 2020

@sougou I was rereading your comment above and I actually had a question, particularly about this:

Trying to make it smarter is essentially an endless complication because mysql has default settings on every schema entity and they layer on top of each other.

If someone sets a table default, would that override any layers before that? Would it be possible to at least unambiguously handle some cases automatically that way, or do you need to know the full picture to make any reasonable assertion?

@sougou
Copy link
Contributor

sougou commented Mar 26, 2020

That may work and could be worthy workaround.

sougou added a commit that referenced this issue Apr 13, 2020
Issue #5265 ; add a --skip-verify option to CopySchemaShard.
@morgo morgo removed this from the v6.0 milestone Apr 27, 2020
@morgo
Copy link
Contributor

morgo commented Apr 27, 2020

A workaround has been merged, so I am removing this from the v6.0 milestone.

@mattlord
Copy link
Contributor

I'm closing this as it's no longer an issue. If you feel that I'm missing something here please let me know and we can re-open it. Thanks!

systay pushed a commit that referenced this issue Jul 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants