-
-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cleanly support PK and FK remapping when importing #171
Comments
azaslavsky
added a commit
to getsentry/sentry
that referenced
this issue
Aug 3, 2023
Prior to this change, model uniqueness was established using the concatenation of model name ("sentry.somemodel") and the pk that particular instance contains in the database. Unfortunately, this is not sufficient for comparing two JSON dumps for equality, as the pks may change depending on which underlying validation database is used. This change moves us to a model where, at ingestion time, each model being compared is assigned a per-model-kind ordinal, such that its unique identifier is roughly "this is the 3rd lowest pk of model `sentry.somemodel`". To help this transition along, we simplify some of the equality checks, reducing them to `UnorderedInput` and `UnequalCounts` (there are more instance of a certain model kind in one side than the other). Issue: getsentry/team-ospo#171
azaslavsky
added a commit
to getsentry/sentry
that referenced
this issue
Aug 7, 2023
Prior to this change, model uniqueness was established using the concatenation of model name ("sentry.somemodel") and the pk that particular instance contains in the database. Unfortunately, this is not sufficient for comparing two JSON dumps for equality, as the pks may change depending on which underlying validation database is used. This change moves us to a model where, at ingestion time, each model being compared is assigned a per-model-kind ordinal, such that its unique identifier is roughly "this is the 3rd lowest pk of model `sentry.somemodel`". To help this transition along, we simplify some of the equality checks, reducing them to `UnorderedInput` and `UnequalCounts` (there are more instance of a certain model kind in one side than the other). Issue: getsentry/team-ospo#171
This was referenced Aug 7, 2023
azaslavsky
added a commit
to getsentry/sentry
that referenced
this issue
Aug 8, 2023
Prior to this change, model uniqueness was established using the concatenation of model name ("sentry.somemodel") and the pk that particular instance contains in the database. Unfortunately, this is not sufficient for comparing two JSON dumps for equality, as the pks may change depending on which underlying validation database is used. This change moves us to a model where, at ingestion time, each model being compared is assigned a per-model-kind ordinal, such that its unique identifier is roughly "this is the 3rd lowest pk of model `sentry.somemodel`". To help this transition along, we simplify some of the equality checks, reducing them to `UnorderedInput` and `UnequalCounts` (there are more instance of a certain model kind in one side than the other). Issue: getsentry/team-ospo#171
spalmurray
pushed a commit
to getsentry/sentry
that referenced
this issue
Aug 8, 2023
Prior to this change, model uniqueness was established using the concatenation of model name ("sentry.somemodel") and the pk that particular instance contains in the database. Unfortunately, this is not sufficient for comparing two JSON dumps for equality, as the pks may change depending on which underlying validation database is used. This change moves us to a model where, at ingestion time, each model being compared is assigned a per-model-kind ordinal, such that its unique identifier is roughly "this is the 3rd lowest pk of model `sentry.somemodel`". To help this transition along, we simplify some of the equality checks, reducing them to `UnorderedInput` and `UnequalCounts` (there are more instance of a certain model kind in one side than the other). Issue: getsentry/team-ospo#171
azaslavsky
added a commit
to getsentry/sentry
that referenced
this issue
Aug 9, 2023
There are a number of properties of the export system API that we'd like to eventually change. These include: - Barring users from manually excluding/including specific models on export - Only exporting `sentry...` models To keep the original behavior export system intact until we debut these changes, this commit hides the behind `OldExportConfig`, which allows them to be toggled on (for tests) but left off for the actual CLI tool (for now). Issue: getsentry/team-ospo#171 Issue: getsentry/team-ospo#184
azaslavsky
added a commit
to getsentry/sentry
that referenced
this issue
Aug 9, 2023
We will re-use this functionality in multiple places, so its best to move it now. Issue: getsentry/team-ospo#171
azaslavsky
added a commit
to getsentry/sentry
that referenced
this issue
Aug 9, 2023
Prior to this change, the `sort_dependencies` function was the only public dependency calculation that the import/export system used, merely returning a list of models in dependency order. This change splits part of that functionality out, exposing an intermediate state where we can see all of the model relations for a given model, including which fields they occupy on the main model. This will be useful when we add "foreign-key remapping" capabilities on import. To facilitate this, we have added a couple of fixtures that capture the state of the model dependency graph in source control, in both detailed and flattened form. These serve a few purposes: 1. It maintains a record in source control of model dependency graph changes. 2. It notifies team-ospo of any such changes, allowing us to ensure that they don't break import/export functionality. 3. It serves as a simple golden test for the dependency resolution mechanism. Issue: getsentry/team-ospo#171
azaslavsky
added a commit
to getsentry/sentry
that referenced
this issue
Aug 9, 2023
There are a number of properties of the import system API that we'd like to eventually change. These include: - Using `INSERT` rather than `INSERT_OR_UPDATE` semantics for import operations. - Using natural foreign keys (currently only relevant to the `sentry.User` model). To keep the original behavior export system intact until we debut these changes, this commit hides the behind `OldImportConfig`, which allows them to be toggled on (for tests) but left off for the actual CLI tool (for now). Issue: getsentry/team-ospo#171 Issue: getsentry/team-ospo#184
azaslavsky
added a commit
to getsentry/sentry
that referenced
this issue
Aug 9, 2023
The `sorted_dependencies` function has, since time immemorial, tried to track all `ManyToManyField` dependencies without a `through` model set, since there is otherwise no way to deduce the dependency relationship for the shadow junction tables Django creates under the hood. Except... we actually don't have any `ManyToManyField` definitions without an explicit `through` argument. Maybe we did when this code was written, but today we no longer do. The removed bits of code were thus never really executed, since they were checking for a state of affairs that wasn't really a problem in our actual code base. They were also buggy: the comment said they were checking for `through`-ness, but there was no code to this effect, resulting in double dependencies. This change removes the check altogether, and adds an invariant test to ensure that all `ManyToManyField`s defined in the future carry a `through` argument. Issue: getsentry/team-ospo#171
azaslavsky
added a commit
to getsentry/sentry
that referenced
this issue
Aug 10, 2023
Prior to this change, the `sort_dependencies` function was the only public dependency calculation that the import/export system used, merely returning a list of models in dependency order. This change splits part of that functionality out, exposing an intermediate state where we can see all of the model relations for a given model, including which fields they occupy on the main model. This will be useful when we add "foreign-key remapping" capabilities on import. To facilitate this, we have added a couple of fixtures that capture the state of the model dependency graph in source control, in both detailed and flattened form. These serve a few purposes: 1. It maintains a record in source control of model dependency graph changes. 2. It notifies team-ospo of any such changes, allowing us to ensure that they don't break import/export functionality. 3. It serves as a simple golden test for the dependency resolution mechanism. Issue: getsentry/team-ospo#171
azaslavsky
added a commit
to getsentry/sentry
that referenced
this issue
Aug 10, 2023
Prior to this change, the `sort_dependencies` function was the only public dependency calculation that the import/export system used, merely returning a list of models in dependency order. This change splits part of that functionality out, exposing an intermediate state where we can see all of the model relations for a given model, including which fields they occupy on the main model. This will be useful when we add "foreign-key remapping" capabilities on import. To facilitate this, we have added a couple of fixtures that capture the state of the model dependency graph in source control, in both detailed and flattened form. These serve a few purposes: 1. It maintains a record in source control of model dependency graph changes. 2. It notifies team-ospo of any such changes, allowing us to ensure that they don't break import/export functionality. 3. It serves as a simple golden test for the dependency resolution mechanism. Issue: getsentry/team-ospo#171
azaslavsky
added a commit
to getsentry/sentry
that referenced
this issue
Aug 10, 2023
There are a number of properties of the import system API that we'd like to eventually change. These include: - Using `INSERT` rather than `INSERT_OR_UPDATE` semantics for import operations. - Using natural foreign keys (currently only relevant to the `sentry.User` model). To keep the original behavior export system intact until we debut these changes, this commit hides the behind `OldImportConfig`, which allows them to be toggled on (for tests) but left off for the actual CLI tool (for now). Issue: getsentry/team-ospo#171 Issue: getsentry/team-ospo#184
azaslavsky
added a commit
to getsentry/sentry
that referenced
this issue
Aug 10, 2023
The `sorted_dependencies` function has, since time immemorial, tried to track all `ManyToManyField` dependencies without a `through` model set, since there is otherwise no way to deduce the dependency relationship for the shadow junction tables Django creates under the hood. Except... we actually don't have any `ManyToManyField` definitions without an explicit `through` argument. Maybe we did when this code was written, but today we no longer do. The removed bits of code were thus never really executed, since they were checking for a state of affairs that wasn't really a problem in our actual code base. They were also buggy: the comment said they were checking for `through`-ness, but there was no code to this effect, resulting in double dependencies. This change removes the check altogether, and adds an invariant test to ensure that all `ManyToManyField`s defined in the future carry a `through` argument. Issue: getsentry/team-ospo#171
azaslavsky
added a commit
to getsentry/sentry
that referenced
this issue
Aug 11, 2023
There are a number of properties of the import system API that we'd like to eventually change. These include: - Using `INSERT` rather than `INSERT_OR_UPDATE` semantics for import operations. - Using natural foreign keys (currently only relevant to the `sentry.User` model). To keep the original behavior export system intact until we debut these changes, this commit hides the behind `OldImportConfig`, which allows them to be toggled on (for tests) but left off for the actual CLI tool (for now). Issue: getsentry/team-ospo#171 Issue: getsentry/team-ospo#184
azaslavsky
added a commit
to getsentry/sentry
that referenced
this issue
Aug 11, 2023
The `sorted_dependencies` function has, since time immemorial, tried to track all `ManyToManyField` dependencies without a `through` model set, since there is otherwise no way to deduce the dependency relationship for the shadow junction tables Django creates under the hood. Except... we actually don't have any `ManyToManyField` definitions without an explicit `through` argument. Maybe we did when this code was written, but today we no longer do. The removed bits of code were thus never really executed, since they were checking for a state of affairs that wasn't really a problem in our actual code base. They were also buggy: the comment said they were checking for `through`-ness, but there was no code to this effect, resulting in double dependencies. This change removes the check altogether, and adds an invariant test to ensure that all `ManyToManyField`s defined in the future carry a `through` argument. Issue: getsentry/team-ospo#171
azaslavsky
added a commit
to getsentry/sentry
that referenced
this issue
Aug 14, 2023
Given two instances of a model that are being validated for equality, we want the foreign keys to be correct relatively (ie, they point to the same model in the respective models' JSON blobs), not absolutely (ie, they are literally the same integer). By creating maps that store the relations between pks and ordinals, we can easily check that the models point to match their respective ordinals regardless of the actual pk numbers. Issue: getsentry/team-ospo#171
azaslavsky
added a commit
to getsentry/sentry
that referenced
this issue
Aug 15, 2023
Given two instances of a model that are being validated for equality, we want the foreign keys to be correct relatively (ie, they point to the same model in the respective models' JSON blobs), not absolutely (ie, they are literally the same integer). By creating maps that store the relations between pks and ordinals, we can easily check that the models point to match their respective ordinals regardless of the actual pk numbers. Issue: getsentry/team-ospo#171
azaslavsky
added a commit
to getsentry/sentry
that referenced
this issue
Aug 15, 2023
When importing into a database that is not entirely flushed, sequences and all, we run into a problem: the foriegn keys in the source JSON file will not match the primary keys of the newly `INSERT`ed models we will be importing. This will cause downstream imports that depend on upstream imports to fail. To resolve this, we maintain a `PrimaryKeyMap` of old pks to new pks for every model. As we import models, we take note of their new pks, so that when foreign key references to these models are encountered later on, we can perform a simple replacement. This generally works well enough, but because we have a circular dependency between `Actor` and `Team`, we must take care to do the appropriate set of dance moves to avoid writing `Actor`s with (necessarily) non-existent `Team` references. To test these changes, I've modified all of `test_models` to *not* reset sequences between database uploads. This should ensure that every such test will produce two JSON files with differing pks, which should give us fairly thorough coverage. Issue: getsentry/team-ospo#170 Issue: getsentry/team-ospo#171
azaslavsky
added a commit
to getsentry/sentry
that referenced
this issue
Aug 15, 2023
When importing into a database that is not entirely flushed, sequences and all, we run into a problem: the foriegn keys in the source JSON file will not match the primary keys of the newly `INSERT`ed models we will be importing. This will cause downstream imports that depend on upstream imports to fail. To resolve this, we maintain a `PrimaryKeyMap` of old pks to new pks for every model. As we import models, we take note of their new pks, so that when foreign key references to these models are encountered later on, we can perform a simple replacement. This generally works well enough, but because we have a circular dependency between `Actor` and `Team`, we must take care to do the appropriate set of dance moves to avoid writing `Actor`s with (necessarily) non-existent `Team` references. To test these changes, I've modified all of `test_models` to *not* reset sequences between database uploads. This should ensure that every such test will produce two JSON files with differing pks, which should give us fairly thorough coverage. Issue: getsentry/team-ospo#170 Issue: getsentry/team-ospo#171
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
No description provided.
The text was updated successfully, but these errors were encountered: