title | summary | toc |
---|---|---|
Known Limitations in CockroachDB v21.1 |
Learn about newly identified limitations in CockroachDB as well as unresolved limitations identified in earlier releases. |
true |
This page describes newly identified limitations in the CockroachDB {{page.release_info.version}} release as well as unresolved limitations identified in earlier releases.
Left joins and inverted joins involving JSONB
, ARRAY
, or spatial-typed columns with a multi-column or partitioned inverted index will not take advantage of the index if the prefix columns of the index are unconstrained, or if they are constrained to multiple, constant values.
To work around this limitation, make sure that the prefix columns of the index are either constrained to single constant values, or are the same as the input columns.
For example, suppose you have the following multi-region database and tables:
CREATE DATABASE multi_region_test_db PRIMARY REGION "europe-west1" REGIONS "us-west1", "us-east1" SURVIVE REGION FAILURE;
USE multi_region_test_db;
CREATE TABLE t1 (
k INT PRIMARY KEY,
geom GEOMETRY
);
CREATE TABLE t2 (
k INT PRIMARY KEY,
geom GEOMETRY,
INVERTED INDEX geom_idx (geom)
) LOCALITY REGIONAL BY ROW;
And you insert some data into the tables:
INSERT INTO t1 SELECT generate_series(1, 1000), 'POINT(1.0 1.0)';
INSERT INTO t2 (crdb_region, k, geom) SELECT 'us-east1', generate_series(1, 1000), 'POINT(1.0 1.0)';
INSERT INTO t2 (crdb_region, k, geom) SELECT 'us-west1', generate_series(1001, 2000), 'POINT(2.0 2.0)';
INSERT INTO t2 (crdb_region, k, geom) SELECT 'europe-west1', generate_series(2001, 3000), 'POINT(3.0 3.0)';
If you attempt a left join between t1
and t2
on only the geometry columns, CockroachDB will not be able to plan an inverted join:
> EXPLAIN SELECT * FROM t1 LEFT JOIN t2 ON st_contains(t1.geom, t2.geom);
info
------------------------------------
distribution: full
vectorized: true
• cross join (right outer)
│ pred: st_contains(geom, geom)
│
├── • scan
│ estimated row count: 3,000
│ table: t2@primary
│ spans: FULL SCAN
│
└── • scan
estimated row count: 1,000
table: t1@primary
spans: FULL SCAN
(15 rows)
However, if you constrain the crdb_region
column to a single value, CockroachDB can plan an inverted join:
> EXPLAIN SELECT * FROM t1 LEFT JOIN t2 ON st_contains(t1.geom, t2.geom) AND t2.crdb_region = 'us-east1';
info
--------------------------------------------------
distribution: full
vectorized: true
• lookup join (left outer)
│ table: t2@primary
│ equality: (crdb_region, k) = (crdb_region,k)
│ equality cols are key
│ pred: st_contains(geom, geom)
│
└── • inverted join (left outer)
│ table: t2@geom_idx
│
└── • render
│
└── • scan
estimated row count: 1,000
table: t1@primary
spans: FULL SCAN
(18 rows)
If you do not know which region to use, you can combine queries with UNION ALL
:
> EXPLAIN SELECT * FROM t1 LEFT JOIN t2 ON st_contains(t1.geom, t2.geom) AND t2.crdb_region = 'us-east1'
UNION ALL SELECT * FROM t1 LEFT JOIN t2 ON st_contains(t1.geom, t2.geom) AND t2.crdb_region = 'us-west1'
UNION ALL SELECT * FROM t1 LEFT JOIN t2 ON st_contains(t1.geom, t2.geom) AND t2.crdb_region = 'europe-west1';
info
----------------------------------------------------------
distribution: full
vectorized: true
• union all
│
├── • union all
│ │
│ ├── • lookup join (left outer)
│ │ │ table: t2@primary
│ │ │ equality: (crdb_region, k) = (crdb_region,k)
│ │ │ equality cols are key
│ │ │ pred: st_contains(geom, geom)
│ │ │
│ │ └── • inverted join (left outer)
│ │ │ table: t2@geom_idx
│ │ │
│ │ └── • render
│ │ │
│ │ └── • scan
│ │ estimated row count: 1,000
│ │ table: t1@primary
│ │ spans: FULL SCAN
│ │
│ └── • lookup join (left outer)
│ │ table: t2@primary
│ │ equality: (crdb_region, k) = (crdb_region,k)
│ │ equality cols are key
│ │ pred: st_contains(geom, geom)
│ │
│ └── • inverted join (left outer)
│ │ table: t2@geom_idx
│ │
│ └── • render
│ │
│ └── • scan
│ estimated row count: 1,000
│ table: t1@primary
│ spans: FULL SCAN
│
└── • lookup join (left outer)
│ table: t2@primary
│ equality: (crdb_region, k) = (crdb_region,k)
│ equality cols are key
│ pred: st_contains(geom, geom)
│
└── • inverted join (left outer)
│ table: t2@geom_idx
│
└── • render
│
└── • scan
estimated row count: 1,000
table: t1@primary
spans: FULL SCAN
(54 rows)
CockroachDB does not currently support IMPORT
s into REGIONAL BY ROW
tables that are part of multi-region databases.
To work around this limitation, you will need to take the following steps:
-
In the source database, export the
crdb_region
column separately when exporting your data.{% include copy-clipboard.html %}
EXPORT INTO CSV 'nodelocal://0/src_rbr' FROM SELECT crdb_region, i from src_rbr;
For more information about the syntax, see
EXPORT
. -
In the destination database, create a table that has a
crdb_region
column of the right type as shown below.{% include copy-clipboard.html %}
CREATE TABLE dest_rbr (crdb_region public.crdb_internal_region NOT NULL, i INT);
-
Import the data (including the
crdb_region
column explicitly) usingIMPORT INTO
:{% include copy-clipboard.html %}
IMPORT INTO dest_rbr (crdb_region, i) CSV DATA ('nodelocal://0/src_rbr/export*.csv')
-
Convert the destination table to
REGIONAL BY ROW
usingALTER TABLE ... ALTER COLUMN
andALTER TABLE ... SET LOCALITY
:{% include copy-clipboard.html %}
ALTER TABLE dest_rbr ALTER COLUMN crdb_region SET DEFAULT default_to_database_primary_region(gateway_region())::public.crdb_internal_region;
{% include copy-clipboard.html %}
ALTER TABLE dest_rbr SET LOCALITY REGIONAL BY ROW AS crdb_region;
In addition to the limitation above, note that CockroachDB cannot yet make the crdb_region
column hidden in the destination table.
CockroachDB supports the PostgreSQL wire protocol and the majority of its syntax. However, CockroachDB does not support some of the PostgreSQL features or behaves differently from PostgreSQL because not all features can be easily implemented in a distributed system.
For a list of known differences in syntax and behavior between CockroachDB and PostgreSQL, see Features that differ from PostgreSQL.
CockroachDB does not currently support multiple arbiter indexes for INSERT ON CONFLICT DO UPDATE
, and will return an error if there are multiple unique or exclusion constraints matching the ON CONFLICT DO UPDATE
specification.
CockroachDB does not currently support IMPORT
s into tables with partial indexes.
To work around this limitation:
- Drop any partial indexes defined on the table.
- Perform the
IMPORT
. - Recreate the partial indexes.
If you are performing an IMPORT
of a PGDUMP
with partial indexes:
- Drop the partial indexes on the PostgreSQL server.
- Recreate the
PGDUMP
. IMPORT
thePGDUMP
.- Add partial indexes on the CockroachDB server.
{% include {{ page.version.version }}/known-limitations/restore-aost.md %}
CockroachDB supports efficiently storing and querying spatial data, with the following limitations:
-
Not all PostGIS spatial functions are supported.
-
The
AddGeometryColumn
spatial function only allows constant arguments. -
The
AddGeometryColumn
spatial function only allows thetrue
value for itsuse_typmod
parameter. -
CockroachDB does not support the
@
operator. Instead of using@
in spatial expressions, we recommend using the inverse, with~
. For example, instead ofa @ b
, useb ~ a
. -
CockroachDB does not yet support
INSERT
s into thespatial_ref_sys
table. This limitation also blocks theogr2ogr -f PostgreSQL
file conversion command. -
CockroachDB does not yet support
DECLARE CURSOR
, which prevents theogr2ogr
conversion tool from exporting from CockroachDB to certain formats and prevents QGIS from working with CockroachDB. To work around this limitation, export data first to CSV or GeoJSON format. -
CockroachDB does not yet support Triangle or
TIN
spatial shapes. -
CockroachDB does not yet support Curve, MultiCurve, or CircularString spatial shapes.
-
CockroachDB does not yet support k-nearest neighbors.
-
CockroachDB does not support using schema name prefixes to refer to data types with type modifiers (e.g.,
public.geometry(linestring, 4326)
). Instead, use fully-unqualified names to refer to data types with type modifiers (e.g.,geometry(linestring,4326)
).Note that, in
IMPORT PGDUMP
output,GEOMETRY
andGEOGRAPHY
data type names are prefixed bypublic.
. If the type has a type modifier, you must remove thepublic.
from the type name in order for the statements to work in CockroachDB.
It is not currently possible to use a subquery in a SET
or SET CLUSTER SETTING
statement. For example:
{% include copy-clipboard.html %}
> SET application_name = (SELECT 'a' || 'b');
ERROR: invalid value for parameter "application_name": "(SELECT 'a' || 'b')"
SQLSTATE: 22023
DETAIL: subqueries are not allowed in SET
The COMMENT ON
statement associates comments to databases, tables, or columns. However, the internal table (system.comments
) in which these comments are stored is not captured by a BACKUP
of a table or database.
As a workaround, take a cluster backup instead, as the system.comments
table is included in cluster backups.
Change data capture (CDC) provides efficient, distributed, row-level change feeds into Apache Kafka for downstream processing such as reporting, caching, or full-text indexing.
{% include {{ page.version.version }}/known-limitations/cdc.md %}
Accessing the DB Console for a secure cluster now requires login information (i.e., username and password). This login information is stored in a system table that is replicated like other data in the cluster. If a majority of the nodes with the replicas of the system table data go down, users will be locked out of the DB Console.
AS OF SYSTEM TIME
can only be used in a top-level SELECT
statement. That is, we do not support statements like INSERT INTO t SELECT * FROM t2 AS OF SYSTEM TIME <time>
or two subselects in the same statement with differing AS OF SYSTEM TIME
arguments.
The use of tables with very large primary or secondary index keys (>32KB) can result in excessive memory usage. Specifically, if the primary or secondary index key is larger than 32KB the default indexing scheme for storage engine SSTables breaks down and causes the index to be excessively large. The index is pinned in memory by default for performance.
To work around this issue, we recommend limiting the size of primary and secondary keys to 4KB, which you must account for manually. Note that most columns are 8B (exceptions being STRING
and JSON
), which still allows for very complex key structures.
CockroachDB tries to optimize most comparisons operators in WHERE
and HAVING
clauses into constraints on SQL indexes by only accessing selected rows. This is done for LIKE
clauses when a common prefix for all selected rows can be determined in the search pattern (e.g., ... LIKE 'Joe%'
). However, this optimization is not yet available if the ESCAPE
keyword is also used.
TRUNCATE
is not a DML statement, but instead works as a DDL statement. Its limitations are the same as other DDL statements, which are outlined in Online Schema Changes: Limitations
CockroachDB does not currently key-encode JSON values. As a result, tables cannot be ordered by JSONB
/JSON
-typed columns.
Altering the minimum or maximum value of a series does not check the current value of a series. This means that it is possible to silently set the maximum to a value less than, or a minimum value greater than, the current value.
When setting the default_int_size
session variable in a batch of statements such as SET default_int_size='int4'; SELECT 1::IN
, the default_int_size
variable will not take affect until the next statement. This happens because statement parsing takes place asynchronously from statement execution.
As a workaround, set default_int_size
via your database driver, or ensure that SET default_int_size
is in its own statement.
{% include {{ page.version.version }}/known-limitations/copy-from-clients.md %}
{% include {{ page.version.version }}/known-limitations/copy-syntax.md %}
{% include {{ page.version.version }}/known-limitations/import-high-disk-contention.md %}
{% include {{ page.version.version }}/known-limitations/partitioning-with-placeholders.md %}
It is currently not possible to add a column to a table when the column uses a sequence as the DEFAULT
value, for example:
{% include copy-clipboard.html %}
> CREATE TABLE t (x INT);
{% include copy-clipboard.html %}
> INSERT INTO t(x) VALUES (1), (2), (3);
{% include copy-clipboard.html %}
> CREATE SEQUENCE s;
{% include copy-clipboard.html %}
> ALTER TABLE t ADD COLUMN y INT DEFAULT nextval('s');
ERROR: nextval(): unimplemented: cannot evaluate scalar expressions containing sequence operations in this context
SQLSTATE: 0A000
{% include {{ page.version.version }}/misc/available-capacity-metric.md %}
{% include {{ page.version.version }}/known-limitations/schema-changes-within-transactions.md %}
Schema change DDL statements inside a multi-statement transaction can fail while other statements succeed
{% include {{ page.version.version }}/known-limitations/schema-change-ddl-inside-multi-statement-transactions.md %}
{% include {{ page.version.version }}/known-limitations/schema-changes-between-prepared-statements.md %}
When inserting/updating all columns of a table, and the table has no secondary indexes, we recommend using an UPSERT
statement instead of the equivalent INSERT ON CONFLICT
statement. Whereas INSERT ON CONFLICT
always performs a read to determine the necessary writes, the UPSERT
statement writes without reading, making it faster.
This issue is particularly relevant when using a simple SQL table of two columns to simulate direct KV access. In this case, be sure to use the UPSERT
statement.
CockroachDB imposes a hard limit of 16MiB on the data input for a single statement passed to CockroachDB from a client (including the SQL shell). We do not recommend attempting to execute statements from clients with large input.
In the built-in SQL shell, using the \|
operator to perform a large number of inputs from a file can cause the server to close the connection. This is because \|
sends the entire file as a single query to the server, which can exceed the upper bound on the size of a packet the server can accept from any client (16MB).
As a workaround, execute the file from the command line with cat data.sql | cockroach sql
instead of from within the interactive shell.
When executing an ALTER TABLE ADD COLUMN
statement with a DEFAULT
expression, new values generated:
- use the default search path regardless of the search path configured in the current session via
SET SEARCH_PATH
. - use the UTC time zone regardless of the time zone configured in the current session via
SET TIME ZONE
. - have no default database regardless of the default database configured in the current session via
SET DATABASE
, so you must specify the database of any tables they reference. - use the transaction timestamp for the
statement_timestamp()
function regardless of the time at which theALTER
statement was issued.
When nodes are started with the --locality
flag, CockroachDB attempts to place the replica lease holder (the replica that client requests are forwarded to) on the node closest to the source of the request. This means as client requests move geographically, so too does the replica lease holder.
However, you might see increased latency caused by a consistently high rate of lease transfers between datacenters in the following case:
- Your cluster runs in datacenters which are very different distances away from each other.
- Each node was started with a single tier of
--locality
, e.g.,--locality=datacenter=a
. - Most client requests get sent to a single datacenter because that's where all your application traffic is.
To detect if this is happening, open the DB Console, select the Queues dashboard, hover over the Replication Queue graph, and check the Leases Transferred / second data point. If the value is consistently larger than 0, you should consider stopping and restarting each node with additional tiers of locality to improve request latency.
For example, let's say that latency is 10ms from nodes in datacenter A to nodes in datacenter B but is 100ms from nodes in datacenter A to nodes in datacenter C. To ensure A's and B's relative proximity is factored into lease holder rebalancing, you could restart the nodes in datacenter A and B with a common region, --locality=region=foo,datacenter=a
and --locality=region=foo,datacenter=b
, while restarting nodes in datacenter C with a different region, --locality=region=bar,datacenter=c
.
Many string operations are not properly overloaded for collated strings, for example:
{% include copy-clipboard.html %}
> SELECT 'string1' || 'string2';
?column?
------------------
string1string2
(1 row)
{% include copy-clipboard.html %}
> SELECT ('string1' collate en) || ('string2' collate en);
pq: unsupported binary operator: <collatedstring{en}> || <collatedstring{en}>
When creating or updating a row, if the combined size of all values in a single column family exceeds the max range size (512 MiB by default) for the table, the operation may fail, or cluster performance may suffer.
As a workaround, you can either manually split a table's columns into multiple column families, or you can create a table-specific zone configuration with an increased max range size.
When a node has both a high number of client connections and running queries, the node may crash due to memory exhaustion. This is due to CockroachDB not accurately limiting the number of clients and queries based on the amount of available RAM on the node.
To prevent memory exhaustion, monitor each node's memory usage and ensure there is some margin between maximum CockroachDB memory usage and available system RAM. For more details about memory usage in CockroachDB, see this blog post.
Given a query like SELECT * FROM foo WHERE a > 1 OR b > 2
, even if there are appropriate indexes to satisfy both a > 1
and b > 2
, the query planner performs a full table or index scan because it cannot use both conditions at once.
Every DELETE
or UPDATE
statement constructs a SELECT
statement, even when no WHERE
clause is involved. As a result, the user executing DELETE
or UPDATE
requires both the DELETE
and SELECT
or UPDATE
and SELECT
privileges on the table.
Transactions with priority HIGH
that contain DDL and ROLLBACK TO SAVEPOINT
are not supported, as they could result in a deadlock. For example:
> BEGIN PRIORITY HIGH; SAVEPOINT s; CREATE TABLE t(x INT); ROLLBACK TO SAVEPOINT s;
ERROR: unimplemented: cannot use ROLLBACK TO SAVEPOINT in a HIGH PRIORITY transaction containing DDL
SQLSTATE: 0A000
HINT: You have attempted to use a feature that is not yet implemented.
See: https://github.com/cockroachdb/cockroach/issues/46414
The built-in SQL shell stores its command history in a single file by default (.cockroachsql_history
). When running multiple instances of the SQL shell on the same machine. Therefore, each shell's command history can get overwritten in unexpected ways.
As a workaround, set the COCKROACH_SQL_CLI_HISTORY
environment variable to different values for the two different shells, for example:
{% include copy-clipboard.html %}
$ export COCKROACH_SQL_CLI_HISTORY=.cockroachsql_history_shell_1
{% include copy-clipboard.html %}
$ export COCKROACH_SQL_CLI_HISTORY=.cockroachsql_history_shell_2
CockroachDB does not allow passwords with special characters to be passed as a connection parameter to cockroach
commands.
CockroachDB servers rely on the network to report when a TCP connection fails. In most scenarios when a connection fails, the network immediately reports a connection failure, resulting in a Connection refused
error.
However, if there is no host at the target IP address, or if a firewall rule blocks traffic to the target address and port, a TCP handshake can linger while the client network stack waits for a TCP packet in response to network requests. To work around this kind of scenario, we recommend the following:
- When migrating a node to a new machine, keep the server listening at the previous IP address until the cluster has completed the migration.
- Configure any active network firewalls to allow node-to-node traffic.
- Verify that orchestration tools (e.g., Kubernetes) are configured to use the correct network connection information.
Some schema changes that drop columns cannot be rolled back properly.
In some cases, the rollback will succeed, but the column data might be partially or totally missing, or stale due to the asynchronous nature of the schema change.
In other cases, the rollback will fail in such a way that will never be cleaned up properly, leaving the table descriptor in a state where no other schema changes can be run successfully.
To reduce the chance that a column drop will roll back incorrectly:
-
Perform column drops in transactions separate from other schema changes. This ensures that other schema change failures won't cause the column drop to be rolled back.
-
Drop all constraints (including unique indexes) on the column in a separate transaction, before dropping the column.
-
Drop any default values or computed expressions on a column before attempting to drop the column. This prevents conflicts between constraints and default/computed values during a column drop rollback.
If you think a rollback of a column-dropping schema change has occurred, check the jobs table. Schema changes with an error prefaced by cannot be reverted, manual cleanup may be required
might require manual intervention.
If the execution of a join query exceeds the limit set for memory-buffering operations (i.e., the value set for the sql.distsql.temp_storage.workmem
cluster setting), CockroachDB will spill the intermediate results of computation to disk. If the join operation spills to disk, and at least one of the equality columns is of type JSON
, CockroachDB returns the error unable to encode table key: *tree.DJSON
. If the memory limit is not reached, then the query will be processed without error.
{% include {{ page.version.version }}/known-limitations/unordered-distinct-operations.md %}
CockroachDB does not support partitioning inverted indexes, including spatial indexes.
{% include {{ page.version.version }}/known-limitations/backup-interleaved.md %}