Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dbinit is getting too order dependent #4286

Closed
smklein opened this issue Oct 17, 2023 · 10 comments · Fixed by #4288
Closed

dbinit is getting too order dependent #4286

smklein opened this issue Oct 17, 2023 · 10 comments · Fixed by #4288
Labels
database Related to database access Update System Replacing old bits with newer, cooler bits

Comments

@smklein
Copy link
Collaborator

smklein commented Oct 17, 2023

Following up from some issues observed by @internet-diglett -- we're observing that the dbinit_equals_sum_of_all_up test is failing more frequently (not flake, just with new PRs) for schema changes which cause constraint changes.

This was referenced by @gjcolombo in https://github.com/oxidecomputer/omicron/tree/main/schema/crdb#adding-tables-and-views , but it's reproducible in some surprising cases.

For example, right now, with main at ee8d93b , if I add the following:

CREATE TABLE IF NOT EXISTS omicron.public.foo (
    id UUID CONSTRAINT pk PRIMARY KEY
);

To both 8.0.0/up.sql, and also to the end of dbinit.sql, I see test failures comparing table constraints.

thread 'integration_tests::schema::dbinit_equals_sum_of_all_up' panicked at 'assertion failed: `(left == right)`'
  left: `"[Row { values: [NamedSqlValue { column: \"constraint_catalog\", value: Some(String(\"omicron\")) }, NamedSqlValue { column: \"constraint_schema\", value: Some(String(\"public\")) }, NamedSqlValue { column: \"c..."` (truncated)
 right: `"[Row { values: [NamedSqlValue { column: \"constraint_catalog\", value: Some(String(\"omicron\")) }, NamedSqlValue { column: \"constraint_schema\", value: Some(String(\"public\")) }, NamedSqlValue { column: \"c..."` (truncated)

Differences (-left|+right):
             NamedSqlValue {
                 column: "constraint_name",
                 value: Some(
                     String(
-                        "4101115737_243_1_not_null",
+                        "4101115737_242_1_not_null",
                     ),
                 ),
             },
             NamedSqlValue {

', nexus/tests/integration_tests/schema.rs:579:9
@smklein smklein added database Related to database access Update System Replacing old bits with newer, cooler bits labels Oct 17, 2023
@internet-diglett
Copy link
Contributor

internet-diglett commented Oct 17, 2023

It seems that there may be some changes in the 7.0.0 migration that causes the expected value generated for the constraint_name to be off by a predictable value. It may be due to a NOT NULL constraint being dropped, or another column being dropped or some combination of statements. Tests pass after moving the migration from 8.0.0 to 6.1.0, and adjusting the placement of the statements in dbinit.sql accordingly.

@david-crespo
Copy link
Contributor

I'm hitting the same problem in #4261 on the latest commit, 5386043.

@internet-diglett
Copy link
Contributor

I'm hitting the same problem in #4261 on the latest commit, 5386043.

THERE ARE TWO OF US!

@smklein
Copy link
Collaborator Author

smklein commented Oct 17, 2023

I really tried avoiding order-dependent anonymous names, with:

CREATE TABLE IF NOT EXISTS omicron.public.foo (
    id UUID CONSTRAINT id_not_null NOT NULL CONSTRAINT foo_pkey PRIMARY KEY
);

But this is still producing an anonymously named "NOT NULL" constraint:

root@pop-os:26257/omicron> SELECT * FROM information_schema.table_constraints where table_name='foo';  
  constraint_catalog | constraint_schema |      constraint_name      | table_catalog | table_schema | table_name | constraint_type | is_deferrable | initially_deferred
---------------------+-------------------+---------------------------+---------------+--------------+------------+-----------------+---------------+---------------------
  omicron            | public            | foo_pkey                  | omicron       | public       | foo        | PRIMARY KEY     | NO            | NO
  omicron            | public            | 4101115737_242_1_not_null | omicron       | public       | foo        | CHECK           | NO            | NO

@smklein
Copy link
Collaborator Author

smklein commented Oct 17, 2023

Admittedly, is_nullable already exists within the information_schema.columns table. We may be able to ignore the specific name of the not_null table constraints, since they are already checked.

@smklein
Copy link
Collaborator Author

smklein commented Oct 17, 2023

Here's my proposal:

When querying the information_schema, we explicitly look for "constraint_name"s where the name ends with "_not_null", and we drop these rows so they don't get compared. These names are internal to CRDB, they're order dependent, and frankly, we don't care what they are.

The comparison of information_schema.columns will already give us coverage on the nullability, so we'd only be redacting information about "CRDB's anonymous names", which we don't care about in the first place.

I'll get a prototype PR up.

@bnaecker
Copy link
Collaborator

It's kind of hard to tell from the docs, but it looks like these anonymous constraints are automatically created to satisfy the primary key constraint. The named PRIMARY KEY constraint shows up in SHOW CONSTRAINTS FROM foo, but the anonymous constraint does not. The docs suggest these are additionally created to satisfy both the uniqueness and the non-nullability of the primary key.

That all corroborates what you're saying, that these can probably be ignored. Another option would be to use SHOW CONSTRAINTS FROM <table_name> instead of querying the information_schema catalog directly, since that doesn't show those anonymous constraints anyway.

@gjcolombo
Copy link
Contributor

I diffed the output of SELECT * from information_schema.table_constraints from a database constructed from dbinit.sql and a database constructed via upgrade from the 1.0.0 schema (in both cases using omicron-dev db-run and db-populate to set up the database and then using the schema-update tool to apply the requisite updates).

ISTM that the names given to NOT NULL constraints are of the form $DATABASE_$TABLE_$COLUMN_not_null; the problem in @smklein's example seems to be that in the upgrade path, the new foo table gets number 243, but in the one-shot dbinit.sql path it gets table number 242. There must be something about the 7.0.0 upgrade that's "retiring" number 242, but I'm currently unsure what it is (my best guess is that it's the upgrade's deletion and recreation of the sled_instance view that's somehow doing this, but again, not at all certain).

I agree with the previous couple comments--we should see if we can come up with checks that depend less on CRDB's internal NOT NULL constraint naming scheme.

@smklein
Copy link
Collaborator Author

smklein commented Oct 17, 2023

That all corroborates what you're saying, that these can probably be ignored. Another option would be to use SHOW CONSTRAINTS FROM <table_name> instead of querying the information_schema catalog directly, since that doesn't show those anonymous constraints anyway.

I'm going to aim for a first draft that modifies our query for information_schema, but I'll look into this approach as a follow-up.

@smklein
Copy link
Collaborator Author

smklein commented Oct 17, 2023

#4288 is ready for review - By parsing information_schema.columns.is_nullable and SHOW CONSTRAINTS FROM <all tables>, I think I have coverage of all constraints in a way that's now order-independent.

smklein added a commit that referenced this issue Oct 18, 2023
This PR does two things:

1. It avoids checking for the order-dependent NOT NULL constraint in
information schema, opting to parse the output of `SHOW CONSTRAINTS` and
[information_schema.columna.is_nullable](https://www.cockroachlabs.com/docs/v23.1/information-schema#columns)
instead.
2. It re-orders the db_metadata creation to the end of dbinit.sql
(mostly for clarity)

Fixes #4286
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
database Related to database access Update System Replacing old bits with newer, cooler bits
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants