-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
zonepb: make subzone DiffWithZone more accurate #62839
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru and @ajstorm)
pkg/ccl/logictestccl/testdata/logic_test/multi_region_zone_configs, line 323 at r2 (raw file):
ALTER table regional_by_row CONFIGURE ZONE USING gc.ttlseconds = 10 statement error zone configuration for table regional_by_row contains incorrectly configured field "num_replicas"
oops this is wrong, one second
ac77025
to
f47e5c3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 4 files at r2, 5 of 5 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru and @otan)
pkg/ccl/logictestccl/testdata/logic_test/multi_region_zone_configs, line 323 at r2 (raw file):
Previously, otan (Oliver Tan) wrote…
oops this is wrong, one second
?
pkg/ccl/logictestccl/testdata/logic_test/multi_region_zone_configs, line 468 at r3 (raw file):
SET override_multi_region_zone_config = false statement error attempting to update zone config which is missing an expected zone configuration for partition "ap-southeast-2" of regional_by_row@idx
Nit: Worth calling crdb_internal.validate_multi_region_zone_configs()
before this call?
pkg/config/zonepb/zone.go, line 774 at r3 (raw file):
if !found { // Zone configs could be extraneously defined as long as it does not // differ between fieldLists.
Nit: I found this comment a bit confusing. Do we mean that there can be an extra zone config defined so long as it doesn't have any fields in the fieldList set?
pkg/config/zonepb/zone.go, line 811 at r3 (raw file):
} // Anything remaining in the map can be presumed to be missing.
Nit: ...provided that there's something set that matches a field in the fieldList?
pkg/config/zonepb/zone.go, line 826 at r3 (raw file):
} } return true, DiffWithZoneMismatch{}, nil
Wow. Nice catch! The zone config possibilities are pretty intense.
pkg/sql/region_util.go, line 456 at r3 (raw file):
func isPlaceholderZoneConfigForMultiRegion(zc zonepb.ZoneConfig) bool { // Placeholders must have at least 1 subzone. if len(zc.Subzones) < 1 {
Nit: Should this be == 0
, to be a bit more clear?
pkg/sql/region_util.go, line 570 at r3 (raw file):
} // Mark the NumReplicas as 0 if we have a placeholder.
Nit: I actually like the old comment better. If you really prefer the new comment, can you add in a reference to "See zonepb.IsSubzonePlaceholder for why this is necessary." I personally find this whole placeholder concept tricky, and I suspect others will too.
pkg/sql/region_util.go, line 1349 at r3 (raw file):
// Remove inactive subzones from the comparison. filteredSubzones := currentZoneConfig.Subzones[:0]
Nit: Did you add this out of an abundance of caution? Or did you hit a case where there were inactive indexes with defined zone configs? I'd think we'd want to root out those cases in addition to handling them here.
pkg/sql/region_util.go, line 1363 at r3 (raw file):
// Mark the expected NumReplicas as 0 if we have a placeholder // and the current zone config is also a placeholder. // The latter check is requires as in cases where non-multiregion fields
Nit: required
pkg/sql/region_util.go, line 1364 at r3 (raw file):
// and the current zone config is also a placeholder. // The latter check is requires as in cases where non-multiregion fields // is set on the current zone config, the expected zone config has needs
Nit: is -> are, also there's an extra "has"
pkg/sql/region_util.go, line 1367 at r3 (raw file):
// the placeholder marked so that DiffWithZone does not error when // num_replicas is expectedly different. if currentZoneConfig.IsSubzonePlaceholder() && isPlaceholderZoneConfigForMultiRegion(expectedZoneConfig) {
I don't understand this check. If expected is a placeholder, won't its NumReplicas already be set to 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru and @ajstorm)
pkg/ccl/logictestccl/testdata/logic_test/multi_region_zone_configs, line 323 at r2 (raw file):
Previously, ajstorm (Adam Storm) wrote…
?
earlier revision had the wrong result a few lines up.
pkg/config/zonepb/zone.go, line 774 at r3 (raw file):
there can be an extra zone config defined so long as it doesn't have any fields in the fieldList set
much cleaner. comment change.
pkg/config/zonepb/zone.go, line 826 at r3 (raw file):
Previously, ajstorm (Adam Storm) wrote…
Wow. Nice catch! The zone config possibilities are pretty intense.
pkg/sql/region_util.go, line 1349 at r3 (raw file):
Previously, ajstorm (Adam Storm) wrote…
Nit: Did you add this out of an abundance of caution? Or did you hit a case where there were inactive indexes with defined zone configs? I'd think we'd want to root out those cases in addition to handling them here.
There were indexes that are being cleaned up during, e.g. transitioning from RBR -> Global which appears to not get cleaned up immediately, in which the zone config hangs around for a while. If these were a concern, I'd say delete this filtering and you'll see a few tests fail, e.g. go test ./pkg/ccl/telemetryccl -run 'TestTelemetry/multiregion/server'
pkg/sql/region_util.go, line 1367 at r3 (raw file):
Previously, ajstorm (Adam Storm) wrote…
I don't understand this check. If expected is a placeholder, won't its NumReplicas already be set to 0?
nope, it is nil. we have to set it by ourselves.
BUT if current zone config is not a placeholder because e.g. gc.ttlseconds is set, then we fudge expected zone config to be a placeholder.
* Subzones may be defined in a different order. We did not take this into account which can cause bugs when e.g. ADD REGION adds a subzone in the end rather than in the old "expected" location in the subzones array. This has been fixed by comparing subzones using an unordered map. * The ApplyZoneConfig we previously did overwrote subzone fields on the original subzone array element, meaning that if there was a mismatch it would not be reported through validation. This is now fixed by applying the expected zone config to *zonepb.NewZoneConfig() instead. * Added logic to only check for zone config matches subzones from active subzone IDs. * Improve the error messaging when a subzone config is mismatching - namely, add index and partitioning information and differentiate between missing fields and missing / extraneous zone configs Release note (bug fix): Fixed validation bugs during ALTER TABLE ... SET LOCALITY / crdb_internal.validate_multi_region_zone_config where validation errors could occur when the database of a REGIONAL BY ROW table has a new region added. Also fix a validation bug partition zone mismatches configs were not caught.
f47e5c3
to
16808f5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 3 files at r4.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @ajstorm, and @otan)
pkg/config/zonepb/zone.go, line 826 at r3 (raw file):
The zone configs giveth, and the zone configs taketh away. Now we just have to figure out how to get them to giveth a bit.
pkg/sql/region_util.go, line 1349 at r3 (raw file):
Previously, otan (Oliver Tan) wrote…
There were indexes that are being cleaned up during, e.g. transitioning from RBR -> Global which appears to not get cleaned up immediately, in which the zone config hangs around for a while. If these were a concern, I'd say delete this filtering and you'll see a few tests fail, e.g.
go test ./pkg/ccl/telemetryccl -run 'TestTelemetry/multiregion/server'
Makes sense that we put this in now, but I'm wondering if there's more work to be done here to clean up those zone configs properly before the schema change, like we do for RBR.
pkg/sql/region_util.go, line 1349 at r3 (raw file): Previously, ajstorm (Adam Storm) wrote…
We can't do it before, because if the schema change fails we have to "put them back". they need to be live right until the atomic swap of the indexes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @ajstorm, and @otan)
pkg/sql/region_util.go, line 1349 at r3 (raw file):
Previously, otan (Oliver Tan) wrote…
We can't do it before, because if the schema change fails we have to "put them back". they need to be live right until the atomic swap of the indexes.
Doing it after makes the most sense atm...
The issue that we hit in the RBR->Global transition originally, was that the ZCs hung around while the indexes were waiting to be GCed. I thought that we caught those by removing the ZCs when transitioning from RBR. If you're still hitting them, then something's not right.
pkg/sql/region_util.go, line 1349 at r3 (raw file): Previously, ajstorm (Adam Storm) wrote…
Something indeed isn't right (found the bug). I'm inclined to keep this since a) the original bug for that is fixed and b) it doesn't make sense for me to error in this case, and will fix that in a separate issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @ajstorm, and @otan)
pkg/sql/region_util.go, line 1349 at r3 (raw file):
Previously, otan (Oliver Tan) wrote…
Something indeed isn't right (found the bug). I'm inclined to keep this since a) the original bug for that is fixed and b) it doesn't make sense for me to error in this case, and will fix that in a separate issue.
🎉
bors r=ajstorm |
Build succeeded: |
into account which can cause bugs when e.g. ADD REGION adds a subzone
in the end rather than in the old "expected" location in the subzones
array. This has been fixed by comparing subzones using an unordered
map.
original subzone array element, meaning that if there was a mismatch
it would not be reported through validation. This is now fixed by
applying the expected zone config to *zonepb.NewZoneConfig() instead.
active subzone IDs.
namely, add index and partitioning information and differentiate
between missing fields and missing / extraneous zone configs
Resolves #62790
Release note (bug fix): Fixed validation bugs during ALTER TABLE ... SET
LOCALITY / crdb_internal.validate_multi_region_zone_config where
validation errors could occur when the database of a REGIONAL BY ROW
table has a new region added. Also fix a validation bug partition zone
mismatches configs were not caught.