Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: Inconsistent zone config behavior when dropping database or table #24179

Closed
a-robinson opened this issue Mar 23, 2018 · 1 comment
Closed
Assignees
Labels
A-schema-changes C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
Milestone

Comments

@a-robinson
Copy link
Contributor

As "discovered" in #24154, we handle zone configs differently when dropping partitions/tables/databases.

  1. When you drop a table, we keep the corresponding zone config around until the table's descriptor is deleted, which is supposed to happen after the configured GC TTL period.
  2. I haven't tested dropping a partition, so I won't speak on it.
  3. When you drop a database, we delete the corresponding zone config immediately.

The behavior when dropping a database is concerning. If (a) there's data in a table in the database (b) the table doesn't have a table-level zone config (c) the database does have a database-level zone config, then dropping the database (e.g. DROP DATABASE ... CASCADE) has the effect of changing the zone config that applies to the data in the table. Before dropping the database, the databases's zone config applies to the table. After dropping the database, the global .default zone config is all that applies to the table.

This means that when we drop a database, it can actually cause a lot of rebalancing if the database had different constraints on it than the .default zone. Even if the database was constrained to certain regions for sovereignty reasons, the yet-to-be-GC'ed data will likely get moved elsewhere.

For example, I created a 6 node local cluster with 3 nodes in locality a=1 and 3 in a=2. I created a table, split it 20 times, and constrained all its replicas to a=1 around 16:07 in the below screenshot. At 16:09 I ran DROP DATABASE ... CASCADE on the database, and suddenly all the data got spread around across both localities because the constraint I had specified no longer existed.

screen shot 2018-03-23 at 12 13 46 pm

There may also be a similar problem with us deleting a table's zone config before all the data has been GC'ed. I'm not an expert in that area, though, so I'll let @benesch speak to that.

@a-robinson a-robinson added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Mar 23, 2018
@a-robinson a-robinson added this to the 2.1 milestone Mar 23, 2018
@knz knz changed the title config/sql: Inconsistent zone config behavior when dropping database or table sql: Inconsistent zone config behavior when dropping database or table May 10, 2018
@vivekmenezes vivekmenezes added A-schema-changes and removed A-schema-descriptors Relating to SQL table/db descriptor handling. A-bulkio-schema-changes labels Jul 24, 2018
@vivekmenezes vivekmenezes removed this from the 2.1 milestone Jul 24, 2018
@knz knz added this to the 2.2 milestone Sep 18, 2018
@vivekmenezes
Copy link
Contributor

to be fixed after #19004 is fixed. Once a job exists for the DROP DATABASE schema change, the job can be used to drop the zone config later.

craig bot pushed a commit that referenced this issue Oct 11, 2018
30666: sql: delete db zone config after GC TTL r=eriktrinh a=eriktrinh

Previously the database's zone config would be deleted immediately when
a DROP DATABASE is initiated, including if there are tables that the
drop cascades down to. This causes dropped tables waiting for the GC
period without a table zone config to have values from the default
config to be applied (which can cause unnecessary data movement).

This change removes the database's zone config once all the cascaded
table data is completely removed, using the dropped table status in the
job details.

Fixes #24179.

Co-authored-by: Erik Trinh <[email protected]>
@craig craig bot closed this as completed in #30666 Oct 11, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-schema-changes C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
Projects
None yet
Development

No branches or pull requests

5 participants