Skip to content

Latest commit

 

History

History
94 lines (66 loc) · 5.68 KB

data-domiciling-with-separate-databases.md

File metadata and controls

94 lines (66 loc) · 5.68 KB
title summary toc
Data Domiciling with Separate Databases
Learn how to use CockroachDB's improved multi-region capabilities to implement data domiciling.
true

{% include {{page.version.version}}/sql/data-domiciling-intro.md %}

This page has instructions for data domiciling in multi-region clusters using a separate database per domicile. At a high level, this process involves:

  1. Creating separate databases per domicile.
  2. Adding regions to those databases using the ADD REGION statement.
  3. Making sure your application is adding data meant for a specific domicile to the correct database.

For more information, see the sections below.

Overview

The best way to keep a specific data set cordoned off from others in CockroachDB is also the simplest: keep it in a separate database. Data that needs to meet specific domiciling requirements such as "data from EU residents must be domiciled in the EU" is most easily met by creating a separate database only for EU residents' data.

In a multi-region setting, you can associate that database with only those regions which should accept and store EU user data. Luckily, CockroachDB supports cross-database selection queries, so you can still join this data with other data sets to keep track of what is happening across your application, while meeting the requirement that data is stored on disk in the allowed localities. Depending on your requirements, you may want to further require that even read queries are only generated from the region in which you are allowed to access the data.

For a list of the limitations of using CockroachDB as part of your approach to data domiciling, see Limitations.

Step 1. Create separate databases per domiciled data requirement

As mentioned above, the best way to keep specific data sets cordoned off from each other in CockroachDB is to keep them in separate databases. To create a separate database for EU-based users, run the following statement:

{% include_cached copy-clipboard.html %}

CREATE DATABASE eu_users;
USE eu_users;

Step 2. Add specific regions to each database

Make sure that CockroachDB is storing the data in the eu_users database in European regions using the ALTER DATABASE ... SET PRIMARY REGION and ... ADD REGION statements:

{% include_cached copy-clipboard.html %}

ALTER DATABASE eu_users SET PRIMARY REGION 'eu-west-1';
ALTER DATABASE eu_users ADD REGION 'eu-west-2';
ALTER DATABASE eu_users ADD REGION 'eu-central-1';

{{site.data.alerts.callout_info}} In order to be able to add these regions from SQL, you must have started the cluster with these regions using the cockroach start --locality flag. {{site.data.alerts.end}}

Step 3. Add domiciled data to the right databases

You will need to make sure that user data associated with EU users is only added to the eu_users database.

How exactly you will accomplish that is beyond the scope of this document, but you will likely need to add some logic to your application and/or to your load balancing infrastructure to make sure that when your application code is inserting or updating EU user data, the data only ever hits the eu_users database. For example, you can set the target database in your connection string. For example:

postgres://maxroach:[email protected]:26257/eu_users?sslmode=verify-full&sslrootcert=certs/app-ca.crt

For a more detailed example showing how to deploy an application and CockroachDB in multiple regions, see Deploy a Global, Serverless Application.

Step 4. Query across databases

Storing data on EU users in a separate database is made easier by the fact that CockroachDB supports cross-database joins. For example, to join data from EU and non-EU user tables in a hypothetical application, you might issue a query like the following, which joins users from separate databases on a shared application ID:

{% include_cached copy-clipboard.html %}

SELECT us_users.name, eu_users.name
  FROM us_users.users, eu_users.users
 WHERE us_users.users.application_id = eu_users.users.application_id ...

Limitations

{% include {{page.version.version}}/sql/data-domiciling-limitations.md %}

  • Cross-region writes are slower than intra-region writes. This may be an issue depending on your application's performance needs, since following the advice above would result in having different databases' data stored in different regions.

See also