Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add docs for migrating to v21.1+ multiregion SQL #11057

Merged
merged 1 commit into from
Sep 17, 2021

Conversation

rmloveland
Copy link
Contributor

@rmloveland rmloveland commented Aug 16, 2021

Fixes #11017

Summary of changes:

  • Add a new page, 'Migrate to Multi-Region SQL', which has:

    • Mappings from each of the legacy replication-zone-based patterns and
      the newer multi-region SQL abstractions that are designed to replace
      them.

    • Instructions for migrating from replication-zone-based workflows to
      multi-region SQL abstractions.

    • A review of the zone configuration changes that result from running
      the multi-region SQL statements.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@github-actions
Copy link

github-actions bot commented Aug 16, 2021

@rmloveland rmloveland force-pushed the 20210812-migrate-to-multiregion-sql-abstractions branch from d9e741f to 7a43492 Compare August 18, 2021 16:16
@netlify
Copy link

netlify bot commented Aug 18, 2021

✔️ Netlify Preview

🔨 Explore the source changes: fb76b1a

🔍 Inspect the deploy log: https://app.netlify.com/sites/cockroachdb-docs/deploys/6143a7bf4b0fd2000787ff45

😎 Browse the preview: https://deploy-preview-11057--cockroachdb-docs.netlify.app

@rmloveland rmloveland force-pushed the 20210812-migrate-to-multiregion-sql-abstractions branch 2 times, most recently from c29f9b6 to 75f4e26 Compare August 19, 2021 19:47
@rmloveland rmloveland marked this pull request as ready for review August 19, 2021 19:48
Copy link
Contributor

@awoods187 awoods187 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 9 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rmloveland)


v21.1/migrate-to-multiregion-sql.md, line 135 at r1 (raw file):

~~~

### Step 5. Configure your database survival goal

Can we mark this optional? Or at least mention you only need to adjust it if your business needs require it.


v21.1/migrate-to-multiregion-sql.md, line 174 at r1 (raw file):

### Step 7. (Optional) View the updated zone configurations

The multi-region SQL statements operate by controlling the same replication zone configurations that you already have access to via the [`ALTER TABLE ... CONFIGURE ZONE`](configure-zone.html) statement. If you are interested in seeing how they work with the lower-level zone config mechanisms, you can use the [`SHOW ZONE CONFIGURATIONS`](show-zone-configurations.html) statement to view the zone configurations.

Maybe we recommend no longer adjusting zone configs using the new patterns?

@awoods187
Copy link
Contributor

A few small comments but this looks really good

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, this looks great!

Reviewed 9 of 9 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @awoods187, and @rmloveland)


_includes/v21.1/zone-configs/variables.md, line 9 at r1 (raw file):

`constraints` | <a name="constraints"></a> An array of required (`+`) and/or prohibited (`-`) constraints influencing the location of replicas. See [Types of Constraints](configure-replication-zones.html#types-of-constraints) and [Scope of Constraints](configure-replication-zones.html#scope-of-constraints) for more details.<br/><br/>To prevent hard-to-detect typos, constraints placed on [store attributes and node localities](configure-replication-zones.html#descriptive-attributes-assigned-to-nodes) must match the values passed to at least one node in the cluster. If not, an error is signalled. To prevent this error, make sure at least one active node is configured to match the constraint. For example, apply `constraints = '[+region=west]'` only if you had set `--locality=region=west` for at least one node while starting the cluster.<br/><br/>**Default:** No constraints, with CockroachDB locating each replica on a unique node and attempting to spread replicas evenly across localities.
`lease_preferences` <a name="lease_preferences"></a> | An ordered list of required and/or prohibited constraints influencing the location of [leaseholders](architecture/overview.html#glossary).  Whether each constraint is required or prohibited is expressed with a leading `+` or `-`, respectively.  Note that lease preference constraints do not have to be shared with the `constraints` field.  For example, it's valid for your configuration to define a `lease_preferences` field that does not reference any values from the `constraints` field.  It's also valid to define a `lease_preferences` field with no `constraints` field at all. <br /><br />  If the first preference cannot be satisfied, CockroachDB will attempt to satisfy the second preference, and so on.  If none of the preferences can be met, the lease will be placed using the default lease placement algorithm, which is to base lease placement decisions on how many leases each node already has, trying to make all the nodes have around the same amount.<br /><br />Each value in the list can include multiple constraints.  For example, the list `[[+zone=us-east-1b, +ssd], [+zone=us-east-1a], [+zone=us-east-1c, +ssd]]` means "prefer nodes with an SSD in `us-east-1b`, then any nodes in `us-east-1a`, then nodes in `us-east-1c` with an SSD."<br /><br /> For a usage example, see [Constrain leaseholders to specific availability zones](configure-replication-zones.html#constrain-leaseholders-to-specific-availability-zones).<br /><br />**Default**: No lease location preferences are applied if this field is not specified.
`global_reads` | <a name="global_reads"></a> <span class="version-tag">New in v21.1:</span> If `true`, transactions operating on the range(s) affected by this zone config should be non-blocking. Most users will not need to modify this setting; it is applied automatically when you [use the `GLOBAL` table locality in a multi-region cluster](multiregion-overview.html#table-locality).

I wonder whether we should say a little bit more here about what this means for reads and writes. We say "transactions ... should be non-blocking", but that doesn't really say much about how this slows down writes but allows reads from any replica in the range. Maybe we link to the GLOBAL tables docs that spell out this trade-off in more detail?


v21.1/migrate-to-multiregion-sql.md, line 30 at r1 (raw file):

--- | ---
[Duplicate Indexes][dupe_index] | [`GLOBAL` tables](global-tables.html)
[Geo-partitioned replicas][geo_replicas] | [`REGIONAL` tables](regional-tables.html) with [`ZONE` survival goals](multiregion-overview.html#surviving-zone-failures)

Do you think that we should mention REGIONAL BY ROW here? Or link to the REGIONAL BY ROW portion of the referenced page?


v21.1/migrate-to-multiregion-sql.md, line 92 at r1 (raw file):

    `ALTER TABLE users PARTITION BY NOTHING;`

2. Remove the manually created partition on the secondary index. This will also automatically remove the replication zone configurations that were applied to the partition as part of the instructions.

s/index/indexes/

Same below.


v21.1/migrate-to-multiregion-sql.md, line 135 at r1 (raw file):

Previously, awoods187 (Andy Woods) wrote…

Can we mark this optional? Or at least mention you only need to adjust it if your business needs require it.

Or at least mention that ZONE is the default.

Copy link
Contributor

@ajstorm ajstorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 9 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @awoods187, @nvanbenschoten, and @rmloveland)


v21.1/migrate-to-multiregion-sql.md, line 9 at r1 (raw file):

## Overview

In CockroachDB v21.1, we added support for [improved multi-region capabilities that make it easier to run global applications](multiregion-overview.html). Using high-level SQL statements, you can control where your data is stored and how it is accessed to provide good performance and low latency for your application's users.

Nit: Good performance and low latency are sometimes the same thing. Maybe "good performance and tuneable latency"?


v21.1/migrate-to-multiregion-sql.md, line 11 at r1 (raw file):

In CockroachDB v21.1, we added support for [improved multi-region capabilities that make it easier to run global applications](multiregion-overview.html). Using high-level SQL statements, you can control where your data is stored and how it is accessed to provide good performance and low latency for your application's users.

Prior to v21.1, the only way to accomplish these goals in a multi-region cluster involved using lower-level mechanisms called [replication zones](configure-replication-zones.html). Replication zones are still fully supported. However, for most users, they are harder to use and in some cases can result in worse performance than the multi-region SQL abstractions.

Do we also want to call out the old topology patterns explicitly? That may resonate with people more than the raw zone configs.


v21.1/migrate-to-multiregion-sql.md, line 22 at r1 (raw file):

{{site.data.alerts.callout_info}}
If you are already using multi-region SQL statements to control your multi-region cluster, you can ignore this page.

Nit: Does it make sense to move this right up to the top of the document? Might help short-circuit things faster.


v21.1/migrate-to-multiregion-sql.md, line 30 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Do you think that we should mention REGIONAL BY ROW here? Or link to the REGIONAL BY ROW portion of the referenced page?

+1


v21.1/migrate-to-multiregion-sql.md, line 31 at r1 (raw file):

[Duplicate Indexes][dupe_index] | [`GLOBAL` tables](global-tables.html)
[Geo-partitioned replicas][geo_replicas] | [`REGIONAL` tables](regional-tables.html) with [`ZONE` survival goals](multiregion-overview.html#surviving-zone-failures)
[Geo-partitioned leaseholders][geo_leaseholders] | [`REGIONAL` tables](regional-tables.html) with [`REGION` survival goals](multiregion-overview.html#surviving-region-failures)

Does it also make sense to mention here that "Follow the workload" has no counterpart in a multi-region database?


v21.1/migrate-to-multiregion-sql.md, line 43 at r1 (raw file):

### Performance considerations

Expect performance to be temporarily affected by this process. When you run the commands described below, the cluster may need to do a significant amount of work to meet the new replica placement constraints it has just been given.

Do we need to say anything here about domiciling issues? The data may move out of a geo temporarily as part of this migration.


v21.1/migrate-to-multiregion-sql.md, line 47 at r1 (raw file):

Depending on how long you are willing to wait between steps for the replica rebalancing process to settle down, data movement may still be occurring when the next set of constraints are added using the multi-region SQL abstractions; this may lead to further data movement.

In other words, the process described here may result in a significant increase in CPU usage, IOPS, and network traffic while the cluster rebalances replicas to meet the final set of constraints you have provided it with. Until this process completes, the cluster may not be able to handle its normal workload. Therefore, we recommend running this procedure at a time when you would normally perform scheduled maintenance.

We might also want to mention here that this work must be done before you transition the database to have associated regions (and that it has to be done for all tables).


v21.1/migrate-to-multiregion-sql.md, line 69 at r1 (raw file):

    {% include_cached copy-clipboard.html %}
    ~~~ sql
    ALTER TABLE postal_codes CONFIGURE ZONE USING constraints = '[]', lease_preferences='[]';

This may not actually work, as this might (I'm not sure) still look like a modified zone configuration. Would it be safer to use CONFIGURE ZONE DISCARD?

@otan thoughts?


v21.1/migrate-to-multiregion-sql.md, line 97 at r1 (raw file):

    `ALTER INDEX users_last_name_index PARTITION BY NOTHING;`

The latency and resiliency benefits of the geo-partitioned replicas pattern can be replaced by making `users` a [`REGIONAL` table](regional-tables.html) with a [`ZONE` survival goal](multiregion-overview.html#surviving-zone-failures).

Do we worry that these instructions may be only part of the story? In some cases will users need to modify their schemas to take into account the crdb_region column? Or create the table using the AS clause?


v21.1/migrate-to-multiregion-sql.md, line 139 at r1 (raw file):

Depending on your database's [survival goal](multiregion-overview.html#survival-goals), you will want to choose one of the following:

- [`ZONE` survival](multiregion-overview.html#surviving-zone-failures): the database will remain fully available for reads and writes if one zone in a region goes down. More than one zone going down may affect availability.

"More than one zone going down concurrently ...?"


v21.1/migrate-to-multiregion-sql.md, line 140 at r1 (raw file):

- [`ZONE` survival](multiregion-overview.html#surviving-zone-failures): the database will remain fully available for reads and writes if one zone in a region goes down. More than one zone going down may affect availability.
- [`REGION` survival](multiregion-overview.html#surviving-region-failures): the database will remain fully available for reads and writes if one region goes down. More than one region going down may affect availability.

"More than one region going down concurrently ...?"


v21.1/migrate-to-multiregion-sql.md, line 174 at r1 (raw file):

Previously, awoods187 (Andy Woods) wrote…

Maybe we recommend no longer adjusting zone configs using the new patterns?

We might want to consider dropping this part entirely. I worry that this may cause customers to do some digging into places that we don't want them (long term). For customers who know that things work this way, they'll know how to get this info.

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @awoods187, @nvanbenschoten, @otan, and @rmloveland)


v21.1/migrate-to-multiregion-sql.md, line 69 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

This may not actually work, as this might (I'm not sure) still look like a modified zone configuration. Would it be safer to use CONFIGURE ZONE DISCARD?

@otan thoughts?

I was thinking along these lines as well. What we have here will work to set constraints and lease_preferences to empty values, but that's not quite the same as unsetting these fields. In other words, constraints from the parent database will not be inherited even after this is run. CONFIGURE ZONE DISCARD will clear these fields and allow inheritance to work properly, but it will throw away all other fields, which may not be desirable. I couldn't find a way to discard specific fields in a zone config. Maybe @otan knows how to do so.

Copy link
Contributor

@awoods187 awoods187 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @awoods187, @nvanbenschoten, @otan, and @rmloveland)


v21.1/migrate-to-multiregion-sql.md, line 11 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

Do we also want to call out the old topology patterns explicitly? That may resonate with people more than the raw zone configs.

Yes...I'd keep what's written but add in the previous pattern names.


v21.1/migrate-to-multiregion-sql.md, line 22 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

Nit: Does it make sense to move this right up to the top of the document? Might help short-circuit things faster.

+1


v21.1/migrate-to-multiregion-sql.md, line 31 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

Does it also make sense to mention here that "Follow the workload" has no counterpart in a multi-region database?

yes--we list that elsewhere but worth making it explicit here


v21.1/migrate-to-multiregion-sql.md, line 43 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

Do we need to say anything here about domiciling issues? The data may move out of a geo temporarily as part of this migration.

probably but i'd prefer to minimally mention it and link the DD doc


v21.1/migrate-to-multiregion-sql.md, line 47 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

We might also want to mention here that this work must be done before you transition the database to have associated regions (and that it has to be done for all tables).

+1


v21.1/migrate-to-multiregion-sql.md, line 97 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

Do we worry that these instructions may be only part of the story? In some cases will users need to modify their schemas to take into account the crdb_region column? Or create the table using the AS clause?

Its probably good to mention that if you have a column already you can use and link to that example


v21.1/migrate-to-multiregion-sql.md, line 174 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

We might want to consider dropping this part entirely. I worry that this may cause customers to do some digging into places that we don't want them (long term). For customers who know that things work this way, they'll know how to get this info.

yeah im conovinced that we should just drop based on this argument

Fixes #11017

Summary of changes:

- Add a new page, 'Migrate to Multi-Region SQL', which has:

  - Mappings from each of the legacy replication-zone-based patterns and
    the newer multi-region SQL abstractions that are designed to replace
    them.

  - Instructions for migrating from replication-zone-based workflows to
    multi-region SQL abstractions.

  - A review of the zone configuration changes that result from running
    the multi-region SQL statements.
@rmloveland rmloveland force-pushed the 20210812-migrate-to-multiregion-sql-abstractions branch from 75f4e26 to fb76b1a Compare September 16, 2021 20:23
Copy link
Contributor Author

@rmloveland rmloveland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! And thanks for the great review Andy, Nathan, and Adam!

I think all feedback has been pretty well addressed except that I kinda disagree about removing the last section with the "here are the new zone configs" explanation (see rationale in comment below)

also I went with the CONFIGURE ZONE DISCARD option in that one discussion below since that seems way easier to explain

PTAL and let me know if you would like any additional edits!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @awoods187, @nvanbenschoten, and @otan)


_includes/v21.1/zone-configs/variables.md, line 9 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I wonder whether we should say a little bit more here about what this means for reads and writes. We say "transactions ... should be non-blocking", but that doesn't really say much about how this slows down writes but allows reads from any replica in the range. Maybe we link to the GLOBAL tables docs that spell out this trade-off in more detail?

Updated as follows:

  • to add what you said about "slows down writes but allows reads from any replica"
  • "non-blocking transactions" is now a link to more details in the Transaction Layer docs (it didn't yet exist when we added this doc)
  • the "GLOBAL tables" link now points to the main GLOBAL tables doc as you suggested

v21.1/migrate-to-multiregion-sql.md, line 9 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

Nit: Good performance and low latency are sometimes the same thing. Maybe "good performance and tuneable latency"?

went with it! thanks!


v21.1/migrate-to-multiregion-sql.md, line 11 at r1 (raw file):

Previously, awoods187 (Andy Woods) wrote…

Yes...I'd keep what's written but add in the previous pattern names.

Updated to say something like "using low-level mechanisms called replication zones in specific patterns we called Duplicate Indexes, ... etc"


v21.1/migrate-to-multiregion-sql.md, line 22 at r1 (raw file):

Previously, awoods187 (Andy Woods) wrote…

+1

Done!


v21.1/migrate-to-multiregion-sql.md, line 30 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

+1

Changed to say RBR and link to that section of the referenced page

Also updated the "Geo-partitioned replicas" section below to say RBR and link to that section


v21.1/migrate-to-multiregion-sql.md, line 31 at r1 (raw file):

Previously, awoods187 (Andy Woods) wrote…

yes--we list that elsewhere but worth making it explicit here

Updated using mostly similar language to what is on the FtW page - lemme know what you think!


v21.1/migrate-to-multiregion-sql.md, line 43 at r1 (raw file):

Previously, awoods187 (Andy Woods) wrote…

probably but i'd prefer to minimally mention it and link the DD doc

Added a little info block at the bottom of this section that's like "hey FYI this data may move out of your specified geos temporarily, check out our domiciling page"


v21.1/migrate-to-multiregion-sql.md, line 47 at r1 (raw file):

Previously, awoods187 (Andy Woods) wrote…

+1

I'm not sure I fully understand, do you mean that basically the data movement from removing the old has to be complete before you can add the new? at least that's how I'm parsing it. I added a section to that effect - PTAL and see if that's close


v21.1/migrate-to-multiregion-sql.md, line 69 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I was thinking along these lines as well. What we have here will work to set constraints and lease_preferences to empty values, but that's not quite the same as unsetting these fields. In other words, constraints from the parent database will not be inherited even after this is run. CONFIGURE ZONE DISCARD will clear these fields and allow inheritance to work properly, but it will throw away all other fields, which may not be desirable. I couldn't find a way to discard specific fields in a zone config. Maybe @otan knows how to do so.

Based on this discussion I have updated this to use CONFIGURE ZONE DISCARD and added a disclaimer to the effect that "hey if you have other custom zone configs, you will have to reapply them" which is hopefully clear enough.

Sounds like there is nuance here but this seems like the clearest instruction to provide - shout if you object


v21.1/migrate-to-multiregion-sql.md, line 92 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

s/index/indexes/

Same below.

Updated!


v21.1/migrate-to-multiregion-sql.md, line 97 at r1 (raw file):

Previously, awoods187 (Andy Woods) wrote…

Its probably good to mention that if you have a column already you can use and link to that example

Updated this section with a note mentioning this and linking to the instructions on the SET LOCALITY page

Also updated this slightly based on a comment from Nathan above to mention REGIONAL BY ROW (not just "regional") and link to RBR info


v21.1/migrate-to-multiregion-sql.md, line 135 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Or at least mention that ZONE is the default.

Marked as optional in the header and revised the intro sentence and first bullet to make clear that ZONE is the default


v21.1/migrate-to-multiregion-sql.md, line 139 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

"More than one zone going down concurrently ...?"

Added the concurrently!


v21.1/migrate-to-multiregion-sql.md, line 140 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

"More than one region going down concurrently ...?"

Updated!


v21.1/migrate-to-multiregion-sql.md, line 174 at r1 (raw file):

Previously, awoods187 (Andy Woods) wrote…

yeah im conovinced that we should just drop based on this argument

Hmmm not sure I agree. Previously, I have argued along the same lines - basically, "If you know enough to be touching this knife I shouldn't need to tell you not to cut yourself"

But apparently that wasn't true, right? In some respects this whole page existing is a response to folks who were using zone configs feeling like there was not enough transitional material to allow them to move across to the new concepts. It's not a page for new users, it's for existing zone config users who needed more educational material so they'd feel comfortable transitioning.

The explicit goal for this section is to say "look we have moved your cheese, but it's ok, it's still the same cheese. Do not be afraid. You can still look at the configs and see what is happening - but yay, now you don't have to."

This page's whole existence is for the short term so we can help folks transition for the longer term

And in the long term we can delete this horrible, horrible page :-)

If you guys really feel strongly about it I'll remove it. But I think our disclaimer at the top of this page about "if you are using MR SQL, you can ignore this" should already make clear this is not for new users.

Copy link
Contributor

@awoods187 awoods187 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @awoods187, @nvanbenschoten, @otan, and @rmloveland)


v21.1/migrate-to-multiregion-sql.md, line 174 at r1 (raw file):

Previously, rmloveland (Rich Loveland) wrote…

Hmmm not sure I agree. Previously, I have argued along the same lines - basically, "If you know enough to be touching this knife I shouldn't need to tell you not to cut yourself"

But apparently that wasn't true, right? In some respects this whole page existing is a response to folks who were using zone configs feeling like there was not enough transitional material to allow them to move across to the new concepts. It's not a page for new users, it's for existing zone config users who needed more educational material so they'd feel comfortable transitioning.

The explicit goal for this section is to say "look we have moved your cheese, but it's ok, it's still the same cheese. Do not be afraid. You can still look at the configs and see what is happening - but yay, now you don't have to."

This page's whole existence is for the short term so we can help folks transition for the longer term

And in the long term we can delete this horrible, horrible page :-)

If you guys really feel strongly about it I'll remove it. But I think our disclaimer at the top of this page about "if you are using MR SQL, you can ignore this" should already make clear this is not for new users.

I can follow that logic rich. I'll defer to your expertise here.

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 4 of 4 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajstorm, @otan, and @rmloveland)

@rmloveland rmloveland merged commit 4fe0b1b into master Sep 17, 2021
Copy link
Contributor

@ajstorm ajstorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


v21.1/migrate-to-multiregion-sql.md, line 47 at r1 (raw file):

Previously, rmloveland (Rich Loveland) wrote…

I'm not sure I fully understand, do you mean that basically the data movement from removing the old has to be complete before you can add the new? at least that's how I'm parsing it. I added a section to that effect - PTAL and see if that's close

Sorry, I wasn't clear. The data movement doesn't need to be complete, but the dropping of the zone configs must be complete. I know that this has already merged, but probably worth another minor update to correct this part.


v21.1/migrate-to-multiregion-sql.md, line 174 at r1 (raw file):

Previously, awoods187 (Andy Woods) wrote…

I can follow that logic rich. I'll defer to your expertise here.

I buy your argument Rich. LGTM.

@rmloveland rmloveland requested a review from taroface September 21, 2021 14:31
@rmloveland
Copy link
Contributor Author

Hey @taroface would appreciate your review of this - we had to merge so it was available for a preso to some internal folks yesterday, but it would be improved by your input

Copy link
Contributor

@taroface taroface left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some suggestions and a couple formatting corrections!

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


v21.1/migrate-to-multiregion-sql.md, line 13 at r2 (raw file):

In CockroachDB v21.1, we added

How do you feel about the more passive "CockroachDB v21.1 added"? Not sure if we refer to ourselves as "we" very often in the docs.


v21.1/migrate-to-multiregion-sql.md, line 15 at r2 (raw file):

in specific patterns we called

Ditto re: "we". I feel it adds a slight barrier in this case since it means the reader starts to wonder about the choices we made/are making.


v21.1/migrate-to-multiregion-sql.md, line 17 at r2 (raw file):

worse performance

Is it possible to be more precise on what "worse" means here, or link to another doc? I'm wondering if someone already using replication zones and considering this migration would want more to go on.


v21.1/migrate-to-multiregion-sql.md, line 19 at r2 (raw file):

This page has instructions for transitioning from controlling multi-region clusters using replication-zone-based workflows to using multi-region SQL abstractions.

Maybe "This page describes how to migrate a multi-region cluster from using replication zones to using multi-region SQL abstractions."


v21.1/migrate-to-multiregion-sql.md, line 21 at r2 (raw file):

Mappings from

"Mappings between"?


v21.1/migrate-to-multiregion-sql.md, line 36 at r2 (raw file):

there is no analogous behavior provided by the multi-region SQL statements that matches Follow the Workload.

"the multi-region SQL statements do not provide a behavior that is analogous to follow-the-workload."


v21.1/migrate-to-multiregion-sql.md, line 36 at r2 (raw file):

Follow the Workload

Looks like we style this follow-the-workload.


v21.1/migrate-to-multiregion-sql.md, line 51 at r2 (raw file):

next set of constraints are added

are -> is


v21.1/migrate-to-multiregion-sql.md, line 53 at r2 (raw file):

Therefore, we recommend running this procedure at a time when you would normally perform scheduled maintenance.

Is it worth bumping this sentence to the top of the section?


v21.1/migrate-to-multiregion-sql.md, line 75 at r2 (raw file):

For example, 

I'd suggest cutting this


v21.1/migrate-to-multiregion-sql.md, line 100 at r2 (raw file):

1. Remove the manually created table partition. This will also automatically remove the replication zone configurations that were applied to the partition as part of the instructions.

Looks like an extra line break here


v21.1/migrate-to-multiregion-sql.md, line 102 at r2 (raw file):

    {% include_cached copy-clipboard.html %}
    `ALTER TABLE users PARTITION BY NOTHING;`

This needs to be bracketed with ~~~ sql and ~~~. Also, remove code ticks.


v21.1/migrate-to-multiregion-sql.md, line 107 at r2 (raw file):

    {% include_cached copy-clipboard.html %}
    `ALTER INDEX users_last_name_index PARTITION BY NOTHING;`

Ditto


v21.1/migrate-to-multiregion-sql.md, line 122 at r2 (raw file):

    {% include_cached copy-clipboard.html %}
    `ALTER TABLE users PARTITION BY NOTHING;`

Ditto


v21.1/migrate-to-multiregion-sql.md, line 127 at r2 (raw file):

    {% include_cached copy-clipboard.html %}
    `ALTER INDEX users_last_name_index PARTITION BY NOTHING;`

Ditto


v21.1/migrate-to-multiregion-sql.md, line 133 at r2 (raw file):

already

Nit: Maybe cut this word


v21.1/migrate-to-multiregion-sql.md, line 142 at r2 (raw file):

~~~

### Step 4. Add more regions to the database

This should be Step 3, unless there's a missing step before this.


v21.1/migrate-to-multiregion-sql.md, line 151 at r2 (raw file):

~~~

### Step 5. (Optional) Configure your database survival goal

This should be Step 4.


v21.1/migrate-to-multiregion-sql.md, line 168 at r2 (raw file):

### Step 6. Configure table localities

This should be Step 5.


v21.1/migrate-to-multiregion-sql.md, line 188 at r2 (raw file):

For more information about when to use `GLOBAL` vs. `REGIONAL` tables, see [When to use `REGIONAL` vs `GLOBAL` tables](when-to-use-regional-vs-global-tables.html).

### Step 7. (Optional) View the updated zone configurations

This should be Step 6.


v21.1/migrate-to-multiregion-sql.md, line 190 at r2 (raw file):

The multi-region SQL statements operate by controlling the same replication zone configurations that you already have access to via

Maybe "The multi-region SQL statements operate on the same replication zone configurations that you can access via" (if this is still accurate)


v21.1/migrate-to-multiregion-sql.md, line 192 at r2 (raw file):

Low-latency reads and writes in a multi-region cluster

This should be title case to match with the earlier mention of this article. Also, no hyphen after "Low" according to the article title.

Copy link
Contributor Author

@rmloveland rmloveland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the additional comments Ryan and Adam! I have put the new stuff in a draft PR #11767

Adam, I had more questions for you in our discussion below re: "how to know zone configs are done dropping" and will then update the above PR with that info

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


v21.1/migrate-to-multiregion-sql.md, line 47 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

Sorry, I wasn't clear. The data movement doesn't need to be complete, but the dropping of the zone configs must be complete. I know that this has already merged, but probably worth another minor update to correct this part.

Ah ok. In that case: as a user, how do I know "dropping the old zone configs" is actually complete? I think all I have is SHOW ZONE CONFIGURATIONS right?

In other words, in a process like the below, where I have e.g. 10 configs on a table foo:

  1. SHOW ZONE CONFIGURATIONS FROM TABLE foo;
  2. ALTER TABLE foo CONFIGURE ZONE DISCARD
  3. SHOW ZONE CONFIGURATIONS FROM TABLE foo;
  4. Repeat step 3 until done?

does the SHOW ZONE in step 3 continue to return a shorter and shorter list of zone configs as each config is dropped, such that I can "poll" by running that command every second and watch it go 10 .. 7 ... 3 ... etc down to 0?

or does SHOW ZONE just tell me "all good, everything is dropped" and then that process actually happens async? in which case is there some other user-visible proxy for "zone configs are done being dropped"?


v21.1/migrate-to-multiregion-sql.md, line 13 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
In CockroachDB v21.1, we added

How do you feel about the more passive "CockroachDB v21.1 added"? Not sure if we refer to ourselves as "we" very often in the docs.

Works for me - updated


v21.1/migrate-to-multiregion-sql.md, line 15 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
in specific patterns we called

Ditto re: "we". I feel it adds a slight barrier in this case since it means the reader starts to wonder about the choices we made/are making.

Updated to remove "we"


v21.1/migrate-to-multiregion-sql.md, line 17 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
worse performance

Is it possible to be more precise on what "worse" means here, or link to another doc? I'm wondering if someone already using replication zones and considering this migration would want more to go on.

Unfortunately we don't have an obvious thing to link to re: the complexity of navigating the "state space" of rolling your own zone configs and tuning/debugging that vs. using something our engineers have designed for you. Maybe we should add more on this? Although a lot of discussion internally AIUI has been about how we may want to continue hiding zone configs from non-expert users and building higher-level features for those users instead

The huge complexity is also why I don't think we should write more here because it's way too much to explain in this space without making this doc horribly long and complex IMO. I think the goal of this doc is to help a user with the transitional workflow, so I'm hoping the explanation of that workflow and how it ultimately maps to zone configs underneath will be enough to convince them - this may be wishful thinking tho :-)


v21.1/migrate-to-multiregion-sql.md, line 19 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
This page has instructions for transitioning from controlling multi-region clusters using replication-zone-based workflows to using multi-region SQL abstractions.

Maybe "This page describes how to migrate a multi-region cluster from using replication zones to using multi-region SQL abstractions."

Nice! Much simpler


v21.1/migrate-to-multiregion-sql.md, line 21 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
Mappings from

"Mappings between"?

went with "mappings from .. to" - I think the between made total sense with the "and" that was there: "between .. and"


v21.1/migrate-to-multiregion-sql.md, line 36 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
there is no analogous behavior provided by the multi-region SQL statements that matches Follow the Workload.

"the multi-region SQL statements do not provide a behavior that is analogous to follow-the-workload."

Nice! Simpler


v21.1/migrate-to-multiregion-sql.md, line 36 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
Follow the Workload

Looks like we style this follow-the-workload.

Ah! Updated


v21.1/migrate-to-multiregion-sql.md, line 51 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
next set of constraints are added

are -> is

Updated!


v21.1/migrate-to-multiregion-sql.md, line 53 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
Therefore, we recommend running this procedure at a time when you would normally perform scheduled maintenance.

Is it worth bumping this sentence to the top of the section?

Great idea - moved it!


v21.1/migrate-to-multiregion-sql.md, line 75 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
For example, 

I'd suggest cutting this

CUT


v21.1/migrate-to-multiregion-sql.md, line 100 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…

Looks like an extra line break here

WHAT THE

removed


v21.1/migrate-to-multiregion-sql.md, line 102 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…

This needs to be bracketed with ~~~ sql and ~~~. Also, remove code ticks.

Fixed!


v21.1/migrate-to-multiregion-sql.md, line 107 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…

Ditto

Fixed!


v21.1/migrate-to-multiregion-sql.md, line 122 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…

Ditto

Fixed


v21.1/migrate-to-multiregion-sql.md, line 127 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…

Ditto

Fixed


v21.1/migrate-to-multiregion-sql.md, line 142 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…

This should be Step 3, unless there's a missing step before this.

Nice catch

ACTUALLY THIS WAS A TEST AND YOU PASSED RYAN

jk I just messed it up


v21.1/migrate-to-multiregion-sql.md, line 151 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…

This should be Step 4.

Fixed


v21.1/migrate-to-multiregion-sql.md, line 168 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…

This should be Step 5.

Fixed


v21.1/migrate-to-multiregion-sql.md, line 188 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…

This should be Step 6.

Fixed


v21.1/migrate-to-multiregion-sql.md, line 190 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
The multi-region SQL statements operate by controlling the same replication zone configurations that you already have access to via

Maybe "The multi-region SQL statements operate on the same replication zone configurations that you can access via" (if this is still accurate)

Updated to use mostly this - removed extra words


v21.1/migrate-to-multiregion-sql.md, line 192 at r2 (raw file):

Previously, taroface (Ryan Kuo) wrote…
Low-latency reads and writes in a multi-region cluster

This should be title case to match with the earlier mention of this article. Also, no hyphen after "Low" according to the article title.

Updated!

Copy link
Contributor

@ajstorm ajstorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


v21.1/migrate-to-multiregion-sql.md, line 47 at r1 (raw file):

Previously, rmloveland (Rich Loveland) wrote…

Ah ok. In that case: as a user, how do I know "dropping the old zone configs" is actually complete? I think all I have is SHOW ZONE CONFIGURATIONS right?

In other words, in a process like the below, where I have e.g. 10 configs on a table foo:

  1. SHOW ZONE CONFIGURATIONS FROM TABLE foo;
  2. ALTER TABLE foo CONFIGURE ZONE DISCARD
  3. SHOW ZONE CONFIGURATIONS FROM TABLE foo;
  4. Repeat step 3 until done?

does the SHOW ZONE in step 3 continue to return a shorter and shorter list of zone configs as each config is dropped, such that I can "poll" by running that command every second and watch it go 10 .. 7 ... 3 ... etc down to 0?

or does SHOW ZONE just tell me "all good, everything is dropped" and then that process actually happens async? in which case is there some other user-visible proxy for "zone configs are done being dropped"?

If you have 10 fields of a zone configuration set for table foo, CONFIGURE ZONE DISCARD will drop all of them with a single call. You shouldn't need steps 3 or 4 above. The user will know when it's complete once the CONFIGURE ZONE DISCARD command completes successfully.

rmloveland added a commit that referenced this pull request Sep 27, 2021
rmloveland added a commit that referenced this pull request Sep 28, 2021
@lnhsingh lnhsingh deleted the 20210812-migrate-to-multiregion-sql-abstractions branch June 23, 2022 14:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add docs for migrating to v21.1+ multiregion SQL abstractions
6 participants