-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reapply "bootstrap: create an explicit zoneconfig for timeseries data" #128032
Conversation
f80cd42
to
ce99667
Compare
I started a roachtest run here to make sure the Edit: the test passed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rafiss did you have thoughts on whether this change needs to come with a migration or not? I don't have strong feelings either way, but as is, this introduces a difference between clusters that are bootstrapped on 24.3 and clusters that were upgraded to it. cc @andrewbaptist maybe you have thoughts here as well.
Reviewed 12 of 12 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @andrewbaptist, @DarrylWong, @rafiss, and @renatolabs)
pkg/cmd/roachtest/tests/gossip.go
line 403 at r1 (raw file):
run(`ALTER RANGE meta %[1]s CONFIGURE ZONE %[2]s 'constraints: {"-rack=0"}'`) run(`ALTER RANGE liveness %[1]s CONFIGURE ZONE %[2]s 'constraints: {"-rack=0"}'`) run(`ALTER RANGE timeseries %[1]s CONFIGURE ZONE %[2]s 'constraints: {"-rack=0"}'`)
Do we need this change? Would we expect RANGE DEFAULT
s constraints to apply to timeseries
as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you have thoughts on whether this change needs to come with a migration or not?
I felt like it was safer not to include a migration. If a customer has already configured the timeseries range (which we may very well have advised them to do while supporting them), then running an upgrade migration would not be safe, or at best, would be surprising since it would clear out any customizations they have made.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @andrewbaptist, @arulajmani, @DarrylWong, and @renatolabs)
pkg/cmd/roachtest/tests/gossip.go
line 403 at r1 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
Do we need this change? Would we expect
RANGE DEFAULT
s constraints to apply totimeseries
as well?
this change was needed to make this test pass.
for that matter, the line for TABLE system.public.replication_stats
also is needed, even though the zone config for that table also does not specify any constraints and should inherit.
there must be something in this test setup that ends up requiring each zone config to be updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a customer has already configured the timeseries range (which we may very well have advised them to do while supporting them), then running an upgrade migration would not be safe, or at best, would be surprising since it would clear out any customizations they have made.
I was imagining that the migration wouldn't modify anything if the timeseries
range was unset; we'd only add this explicit entry in system.zones
if there wasn't one already.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @andrewbaptist, @DarrylWong, and @renatolabs)
ce99667
to
79fb797
Compare
c5644af
to
717c58a
Compare
I've added a migration that does this. PTAL. |
717c58a
to
a08397a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, modulo question about backup/restore.
Reviewed 2 of 3 files at r2, 17 of 17 files at r4, 17 of 17 files at r5, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andrewbaptist, @DarrylWong, @rafiss, and @renatolabs)
pkg/upgrade/upgrades/upgrades.go
line 107 at r5 (raw file):
clusterversion.V24_3_AddTimeseriesZoneConfig.Version(), addTimeseriesZoneConfig, upgrade.RestoreActionNotRequired("this zone config isn't necessary for restore"),
I may be missing something, but I would have expected there to be a need to add this zone configuration on the restore path as well. My understanding was that we backup and restore zone configurations, so we'd need to do handle adding this zone configuration on the restore path as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andrewbaptist, @arulajmani, @DarrylWong, and @renatolabs)
pkg/upgrade/upgrades/upgrades.go
line 107 at r5 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
I may be missing something, but I would have expected there to be a need to add this zone configuration on the restore path as well. My understanding was that we backup and restore zone configurations, so we'd need to do handle adding this zone configuration on the restore path as well.
i assumed we didn't do that during restores, since the current restore logic also does not add in zone configs for other ranges/objects that may not have zone configs in old versions. for example, it doesn't add in zone configs for the system database or other tables that have zone configs added at bootstrap: https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/catalog/bootstrap/metadata.go#L537-L541
right now it just copies the zones configs from whatever is in the backup.
i can easily change it to add the timeseries zone config, if we want to treat it differently for some reason. let me know what you think makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for continuing to push on this!
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @andrewbaptist, @DarrylWong, @rafiss, and @renatolabs)
pkg/upgrade/upgrades/upgrades.go
line 107 at r5 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
i assumed we didn't do that during restores, since the current restore logic also does not add in zone configs for other ranges/objects that may not have zone configs in old versions. for example, it doesn't add in zone configs for the system database or other tables that have zone configs added at bootstrap: https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/catalog/bootstrap/metadata.go#L537-L541
right now it just copies the zones configs from whatever is in the backup.
i can easily change it to add the timeseries zone config, if we want to treat it differently for some reason. let me know what you think makes sense.
Ack. I don't think there's a good reason to treat this differently. I'm just surprised that's how the current restore logic behaves. Would you mind filing an issue with DR if you agree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @andrewbaptist, @arulajmani, @DarrylWong, and @renatolabs)
pkg/upgrade/upgrades/upgrades.go
line 107 at r5 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
Ack. I don't think there's a good reason to treat this differently. I'm just surprised that's how the current restore logic behaves. Would you mind filing an issue with DR if you agree?
sure thing! filed #129084
a08397a
to
3690cbb
Compare
This partially reverts commit f9d47ce. This time, rather than copying over the full zone config for the default range to the timeseries range, we only copy over gc.ttlseconds. This means that other attributes, most notably the replication factor, will be inherited from the default range. This will make it more clear that the timeseries zone config can be changed independently from all the other zone configs. Release note (ops change): New clusters that are initialized for the first time will now have a zone config defined for the `timeseries` range. This zone config only specifies the gc.ttlseconds, so all the other attributes are inherited from the zone config of the `default` range, as they were before. Clusters that are upgraded to v24.3 from a previous version will also have this zone configuration applied to the timeseries range, as long as that range does not already have a zone config.
When upgrading to 24.3, the upgrade will create an explicit zone configuration for the timeseries range if one does not yet exist. It will match the zone config that is created during cluster bootstrap. The release note for this is in the previous commit. Release note: None
3690cbb
to
0ae91a1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andrewbaptist, @arulajmani, @DarrylWong, and @renatolabs)
pkg/cmd/roachtest/tests/gossip.go
line 403 at r1 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
this change was needed to make this test pass.
for that matter, the line for
TABLE system.public.replication_stats
also is needed, even though the zone config for that table also does not specify any constraints and should inherit.there must be something in this test setup that ends up requiring each zone config to be updated.
actually, i tracked this down too -- by initializing the zone config with zonepb.NewZoneConfig()
rather than with &zonepb.ZoneConfig{}
in the bootstrap code, we get the proper inheritance, so this test change isn't needed after all.
TFTR! bors r+ |
Build failed (retrying...): |
This partially reverts commit f9d47ce.
This time, rather than copying over the full zone config for the default range to the timeseries range, we only copy over gc.ttlseconds. This means that other attributes, most notably the replication factor, will be inherited from the default range.
This will make it more clear that the timeseries zone config can be changed independently from all the other zone configs.
fixes #123762
Release note (ops change): New clusters that are initialized for the first time will now have a zone config defined for the
timeseries
range. This zone config only specifies the gc.ttlseconds, so all the other attributes are inherited from the zone config of thedefault
range, as they were before.Clusters that are upgraded to v24.3 from a previous version will also
have this zone configuration applied to the timeseries range, as long as
that range does not already have a zone config.